Analytics Optimization with Columnstore Indexes in Microsoft SQL Server:
Optimizing OLAP Workloads
Edward Pollack
Albany, NY, USA
About the Author
Edward Pollack has over 20 years of experience in database and systems administration,
architecture, and development, becoming an advocate for designing efficient data
structures that can withstand the test of time. He has spoken at many events, such as
SQL Saturdays, PASS Data Community Summit, and Dativerse, as well as at many user
groups, and is the organizer of SQL Saturday Albany. Edward has authored many articles, as well as
the book Dynamic SQL: Applications, Performance, and Security, and a chapter in Expert
T-SQL Window Functions in SQL Server. His first patent was issued in 2021, focused on
the compression of geographical data for use by analytic systems.
In his free time, Ed enjoys video games, sci-fi and fantasy, traveling, and baking.
He lives in the sometimes-frozen icescape of Albany, NY, with his wife Theresa and sons
Nolan and Oliver, and a mountain of (his) video game plushies that help break the fall
when tripping on (their) toys.
About the Technical Reviewer
Borbala Toth-Apathy is a database professional with nearly 20 years of experience in
the IT field. She has an MSc in Computer Science – unsurprisingly, as she’s been writing
code from a young age.
She specialized in databases after her degree and nowadays designs and builds
data warehouses. Her secret passion is data analytics.
In her free time, she enjoys calisthenics and yoga, nature, puzzles, and crafts.
Acknowledgments
A big shout-out to the SQL Server community and the many speakers, organizers,
colleagues, and others who have supported and advised me over the years and provided
opportunities to grow both personally and professionally.
A book doesn’t happen without dedicated reviewers and editors. Speakers do not
get the opportunity to share their knowledge and grow without organizers, sponsors,
and volunteers who provide those chances. Articles are not written without companies,
editors, and services to support and encourage that work.
To everyone who has provided those opportunities to be a part of your events,
publications, and groups: Thank you!!!
Introduction
Analytic data is an ever-present challenge for developers, analysts, and data
professionals. Its rapid growth coupled with the constant organizational need for speedy
answers to complex questions leads to the question of how massive amounts of data can
be stored efficiently enough to allow for real-time analytics.
The goal of this book is to provide enough theory, architecture, use cases, and code to
enable the immediate use of columnstore indexes in solving real-world problems.
Each chapter tackles a specific feature or data architecture need, starting with a
high-level overview and then delving into extensive detail. While not all professionals may need to
know precisely how columnstore indexes compress their data, having that knowledge
available will provide value in the future when the need to improve performance arises.
Most examples in this book are derived from data in the WideWorldImportersDW
and WideWorldImporters databases. On their own, they do not provide enough data to
allow for meaningful examples of analytic data. To augment this, new tables are created,
copied, and modified to allow for objects that are large and interesting enough to be
useful in demonstrating columnstore indexes. While real-world tables can easily contain
tens of billions of rows, the examples in this book settle for about 25 million rows. This
provides a healthy compromise between effective test data and tables that are so large
that building them takes a very long time.
Intended Audience
This book is intended for data professionals, architects, and developers as a tool to
tame reporting and analytic data quickly, inexpensively, and effectively. Topics in this
book range from introductory to quite advanced, allowing it to be used as a tool by both
beginners and those with years of data architecture experience under their belts.
Those who have little experience with columnstore storage technologies will learn
everything needed to implement this feature for the first time and begin realizing
its benefits. Data experts can use the feature and architecture details, as well as
demonstrations provided throughout this book as a tool to optimize existing data
structures.
Columnstore indexes are not only for reporting and analytic data. They can also be
used to tackle the complex organizational need for real-time operational analytics. When
analytic and transactional workloads mix, the risk exists for latency, blocking, locking,
and deadlocking. Developers and data professionals that work with mixed workloads
can benefit greatly from the use of columnstore indexes to help cover each workload
effectively. In addition to improving performance, these efforts can avoid the need to
purchase or architect new systems to manage portions of a mixed workload.
CHAPTER 1
Introduction to Analytic Data in a Transactional Database
Analytic data, by its nature, can be large and challenging to maintain and can grow
quickly over time. Similarly, its use increases with time as analysts and data scientists
find more ways to crunch it. There is a great convenience to having analytic data in close
proximity to its underlying transactional sources. Utility is also gained by choosing a
location for analytic data that can withstand the test of time, thus avoiding the need for
costly migrations if the data is unable to scale appropriately.
Analytic data can reside in a variety of destinations: a dedicated data warehouse,
unstructured storage, third-party analytics solutions, or tables within the transactional
database itself. Each of these options has advantages and disadvantages, but before
diving into specifics, it is important to review analytic data and how it looks.
Note that OLAP stands for “online analytic processing” and references analytic
tables and data, whereas OLTP stands for “online transactional processing” and refers to
transactional tables and data.
Figure 1-1. Illustration of the growth of data when the time dimension is added
Analytic data can also be quite wide. Even a small table can have many calculated
metrics derived from it, allowing its OLAP counterpart to contain significantly more
columns. For example, consider the hypothetical sales order table in Listing 1-1.
This table in Listing 1-1 contains only eight columns that provide information on
orders, including the customer, product details, and timing/cost. After some time and
deliberation, an analytic team creates a data warehouse–style fact table based off of this
transactional table, as seen in Listing 1-2.
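A sketch of the shape such a fact table could take, assuming daily aggregation by customer and product (the names and metrics shown are illustrative):

CREATE TABLE Fact.SalesOrderDaily
(
    OrderDateKey DATE NOT NULL,                 -- Daily grain
    CustomerKey INT NOT NULL,
    ProductKey INT NOT NULL,
    OrderCount INT NOT NULL,                    -- Derived metrics follow
    TotalQuantity BIGINT NOT NULL,
    TotalSales DECIMAL(18,2) NOT NULL,
    MinUnitPrice DECIMAL(18,2) NOT NULL,
    MaxUnitPrice DECIMAL(18,2) NOT NULL,
    AverageUnitPrice DECIMAL(18,2) NOT NULL,
    AverageDaysToShip DECIMAL(9,2) NOT NULL
);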
Note that the analytic table in Listing 1-2 has a variety of additional metrics that are
used to summarize data elements as the source data is transformed from detail data into
daily aggregate data. Analysts could easily add many more columns if an organizational
need existed for them.
Because of these details, analytic data can easily grow and encompass millions or
billions of rows in a destination table. Similarly, an analytic table could contain far more
columns than its source transactional data. As a result, it should not be surprising that an
analytic table can potentially have many rows and/or many columns. Even a table that is
architected to not be wide (many columns) or deep (many rows) can change over time
as an organization grows and evolves. Therefore, having plans to manage larger data can
be beneficial, even if those plans are not implemented when an analytics project is first
completed.
The columns in an analytic table generally fall into two categories:
• Statistical metrics
• Detail data
Statistical metrics are numeric fields that can be aggregated into meaningful derived
metrics. These may be dates, durations, measurements, rates, and more. In general,
these data types use fixed-length data types, are relatively small, and have a predictable
storage footprint.
Detail data consists of text-based data types, such as VARCHAR, JSON, XML, and
other types of markup. This data is typically not used directly in metrics or statistics, but
instead is further crunched for analysis after the fact. While detail data is important, it
is not the focus of this book. If a dimension exists with a large text-based data type, it is
worth considering the normalization of that column to reduce its footprint and allow for
easier management of detail values.
For example, the table in Listing 1-3 contains a large text column.
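A sketch of the shape of such a table (names are illustrative; note the wide, repetitive LogSource column):

CREATE TABLE dbo.EventLog
(
    EventLogID BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    EventTime DATETIME2(3) NOT NULL,
    LogSource VARCHAR(250) NOT NULL,  -- Large, frequently repeated text
    EventDetail VARCHAR(MAX) NOT NULL
);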
Assuming that the data in the LogSource column is repeated often, normalizing
it into its own dimension table would save computing resources and allow for easier
analysis of distinct values or a subset of values, when needed. Listing 1-4 illustrates this
table after LogSource is normalized.
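A sketch of the normalized design, mirroring the table above (names remain illustrative):

CREATE TABLE dbo.LogSource
(
    LogSourceID SMALLINT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    LogSourceName VARCHAR(250) NOT NULL
);

CREATE TABLE dbo.EventLog
(
    EventLogID BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    EventTime DATETIME2(3) NOT NULL,
    LogSourceID SMALLINT NOT NULL    -- Replaces the 250-character string
        REFERENCES dbo.LogSource (LogSourceID),
    EventDetail VARCHAR(MAX) NOT NULL
);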
The LogSource column, which had been represented with a 250-character string, now
uses a SMALLINT to reference the lookup table LogSource instead. When normalizing
a column, ensure that the data type for the lookup key is neither too small (and could
run out of values) nor too large (wasting space). Normalizing dimensions is not always a
necessary action, but it can be beneficial when dimension values are often repeated and
the need to reference them is frequent.
Data Warehouse
A data warehouse is a classic repository for OLAP data structures. Decades of experience
have gone into the development of exceptionally effective data warehousing software
with nearly every major data platform having its own proprietary warehousing solution.
Solutions such as Microsoft's Azure Synapse, Amazon Redshift, and Google
BigQuery are examples of data warehousing solutions that are served up with the intent
of being used alongside other data storage components that these organizations offer.
The upside of using these technologies is that they are prebuilt, ready to use, and are
optimized for analytic data. The downside of these systems is that they involve learning
a new architecture, new scripting languages, and gaining enough expertise to maintain
them over time. Larger organizations can often afford to specialize in this regard, but
even then, the desire to avoid added complexity is sometimes a reason not to start with one
of these solutions.
Unstructured Data
Not all data should go into a SQL database or even a structured data source. Data that
consists heavily of text and markup is not easy to search within the confines of a
transactional database, even with features such as SQL Server’s Full-Text Search enabled.
When the data being worked with is dollars, time, or metrics, there is great flexibility
in where that data can effectively be stored, but large text and files make more sense
to be stored in an unstructured data repository, such as a data lake underpinned by
Hadoop, Hive, or a similar technology offered by a given software vendor.
If a data storage solution is needed and the source data is large and text based,
consider destinations outside of the confines of a transactional database. While this book
is not intended to explore these solutions, knowing when they should be considered is
helpful to avoid architecting around the wrong technology for that data.
As with a data warehouse solution, the primary benefit of a third-party analytic
solution is that it is built for this purpose and can be used with a relatively minimal
amount of effort. Unlike a data warehouse solution, analytics software varies greatly in
its structure, interface, and usage. An organization’s data analytic needs will dictate the
type of solution needed, and some analytics software will meet those needs while some
will not.
The downsides of using a third-party vendor are cost and vendor lock-in. Costs
typically increase as data volume and/or usage increases. Therefore, an analytics
project may begin inexpensively, but end up being quite costly when the data is ten
or a hundred (or a million?!) times larger than its initial size. When architecting any
data solution, be certain to determine cost based on current and future needs. This
ensures that there are no surprises when data and its usage inevitably grow by orders of
magnitude from where they begin.
Vendor lock-in may also pose a challenge. Moving data into an analytics solution
is easy, but is there a well-documented and convenient way to extract that data in
the future? If not, then moving away from that solution may become costly and
time-consuming. If those costs exceed the costs of remaining on that platform, then an
organization may have to begrudgingly choose to maintain its data there indefinitely.
For those that are comfortable and familiar with a given data analytics solution, this is
likely not an issue, though.
Reserve OLTP tables for small analytic data sets where scanning most (or all) of a
table is not prohibitively expensive. Alternatively, they can be used for lookup tables or
other small dimensions. In the earlier example from Listing 1-4, the LogSource table used
a SMALLINT to number its rows. With a primary key that can contain no more than
65,536 values (-32,768 through 32,767), this table is destined to be relatively small. It is
compact enough that indexing the text column would not be prohibitively expensive,
if needed.
These are not trivial considerations and can make the effort to implement an analytic
solution fast, inexpensive, and easy to revise, if needed. In addition, performance can be
measured and optimized as needed to ensure that the end users of analytic data enjoy
acceptable access speeds. In addition, the processes that load data into analytic tables
can be fine-tuned to be exceptionally fast.
Given their simplicity and ease of implementation, it is worthwhile to take the
time and resources to create a proof of concept for storing analytic data in SQL Server,
specifically using columnstore indexes.
Analytic data can be large, consume immense resources, and be challenging to move
when change is needed. Choosing the correct target data structure for analytic data
and making the best possible initial decision is critical to the long-term success of an
analytics project.
This book will explore in detail the nature of analytic workloads and why
columnstore indexes in SQL Server can provide a cost-effective and performant solution.
CHAPTER 2
Transactional vs. Analytic Workloads
Transactional Data
The data that is readily created, altered, deleted, and read by applications is transactional
data. This data may not always be stored in structured databases or adhere to the mantra
of ACID transactions (Atomic, Consistent, Isolated, Durable), but for this discussion, it
will all be considered transactional data.
Consider an order system that handles product orders from a website and a popular
mobile app. This is decidedly a transactional system. Many people access the site
regularly and create orders, check their status, and await the arrival of what they’ve
ordered. The WideWorldImporters sample database contains an Orders table that is
representative of what would be seen in a typical order system, as seen in Figure 2-1.
Figure 2-1. Example of an order table used by a busy order processing system
Note that there are many columns, each of which represents a detail of an order,
such as the salesperson, customer, contact, order date, and purchase order number.
Regardless of how large this table became over time, the average user would usually
not want to view more than a handful of rows at a time. The quantity of data accessed
is typically small as users will rarely view more than current or recent orders. If they do,
pre-existing filters and links would be built to ensure that data can be served up quickly.
Oftentimes, many columns are returned to the application at one time. When
viewing an order, it is far faster and more efficient to return any column that may be needed
rather than be forced to return to the table multiple times to retrieve more columns.
Because users are interacting directly with an application, there is an expectation
that performance will be as fast as possible. Waiting a second or two for an order to
process would be seen as acceptable, whereas waiting 1 or 2 minutes would result in
immediate complaints and lost business. Similarly, because many users are accessing
the site at one time, contention is expected to be high. Since many people will be
accessing their orders at one time, the tables storing that data need to be capable of fast
reads and writes, even when thousands (or millions) of orders are being accessed at
one time.
In general, transactional queries tend to be simple. For example, a user that is
retrieving information on a current order may do so via the following query:
SELECT
Orders.OrderID,
Customers.CustomerName,
Orders.OrderDate,
Orders.ExpectedDeliveryDate,
Orders.CustomerPurchaseOrderNumber,
OrderLines.*
FROM Sales.Orders
INNER JOIN Sales.Customers
ON Customers.CustomerID = Orders.CustomerID
INNER JOIN Sales.OrderLines
ON OrderLines.OrderID = Orders.OrderID
WHERE Customers.CustomerID = 10
AND Orders.OrderDate >= '5/20/2016'
AND Orders.OrderDate < '5/27/2016';
This is a relatively simple query that pulls 5 specific columns, as well as all 12
columns of line-item detail from 3 tables. Figure 2-2 shows the results of this query.
Note that only two rows are returned via a filter on both customer and date. It would
be unusual for a user of this system to request data on hundreds or thousands of orders
at once. Viewing so much data at one time would be challenging, and it is unlikely that
the user interface is built to allow so much data to be retrieved at one time anyway.
Any data that is served up by a transactional system needs to be accurate at runtime.
Someone checking their bank balance needs to get an accurate number without
exception. This means that data integrity needs to be enforced at runtime either by the
database, the application, or a combination of both. Inaccurate data in a transactional
system could have disastrous results if the application in question, for example, belongs
to a hospital, the military, or meteorologists.
Similarly, if a transactional operation fails, that failure needs to be addressed
immediately so that the user's data is not left in an uncertain state. If a user renews their
passport online, but the payment never correctly processes, their application may not
be handled as it should be. Worse, there may not be an easy way for the user to fix it.
Therefore, the ability of that system to properly handle each step of that process together
as a unit is very important. The use of ACID for transactional data is heavily dependent
on the data and its use. A bank cannot afford an inconsistent transaction, but a social
media site probably can. Similarly, data that is eventually consistent will be perfectly
acceptable in some organizational models, whereas it would be destructive in others.
OLTP data is often normalized to assist in ensuring relational integrity and the
ability to generate lists of lookup values efficiently. Normalization can save space and
memory in transactional applications as more verbose text columns are replaced with
numeric lookups. It also allows for easy lookups within the application by querying the
normalized table directly.
In summary, transactional data generally shares these characteristics:
• Many columns are returned at one time, for use by the application.
• Often normalized.
While these characteristics are generally true of transactional data, analytic data is
quite different and is worth exploring further in detail.
Analytic Data
While transactional data is found in live/busy systems processing orders or bank
operations, analytic data is found further downstream. This data is most often associated
with reporting, visualization, and analytics. It is what powers business intelligence, data
science, and the decision making that drives organizations worldwide.
OLAP systems are fundamentally different from transactional systems. To ensure
optimal speed and reliability, each needs to be architected, implemented, and managed
in distinctly different ways.
Analytics typically require queries that access large batches of data at one time,
but will need far fewer columns to perform the calculations needed to drive results.
For example, a common financial request may be to compare revenue for the current
quarter with revenue from the previous quarter. The calculations needed to drive this
request would need to access all revenue data for each quarter or have that data
pre-crunched in a convenient location.
The following query is an example of an analytic data request that seeks to
understand order counts and totals over time:
SELECT
Date.[Calendar Year] AS Order_Year,
Date.[Calendar Month Number] AS Order_Month,
COUNT(*) AS Order_Count,
SUM(Quantity) AS Quantity_Total,
SUM([Total Excluding Tax]) AS [Total Excluding Tax]
FROM Fact.[Order]
INNER JOIN Dimension.Date
ON Date.Date = [Order].[Order Date Key]
WHERE [Order].[Order Date Key] >= '1/1/2016'
AND [Order].[Order Date Key] < '1/1/2017'
GROUP BY Date.[Calendar Year], Date.[Calendar Month Number]
ORDER BY Date.[Calendar Year], Date.[Calendar Month Number];
Note that only a handful of columns are requested, but that almost 30k orders were
queried to return the results shown in Figure 2-3.
A larger OLAP system could easily process millions of rows as a part of a single
request with similar-looking results. An effective analytic data store needs to be able
to scale to support millions (or billions) of rows in any given table without consuming
excessive system resources or queries taking too long to execute.
Unlike transactional systems where data is written in small batches from many
sources, analytic data is typically created in larger bulk operations from a limited set
of organized data load processes. Inserting thousands (or millions) of rows at one
time needs to be fast and efficient as that will be the norm. The diagram in Figure 2-4
illustrates the differences between how data is written in each type of system.
It is important to note that not all OLTP and OLAP systems behave precisely like this,
but will generally follow these usage patterns.
Because analytic data is typically used for reporting, analytics, and data science, the
need for real-time results is less important. Many analytic processes are asynchronous
and not directly monitored by the user that will consume the resulting data. Because
OLAP data changes less often, complex data processing can occur prior to data usage,
ensuring that there is no need to wait long for results when interactive reporting
is important.
Similarly, because the number of data sources, users, and apps accessing the data
is smaller, contention is far less of a challenge. A query may require reading millions of
rows from a table to return its results, but the quantity of queries accessing that table will
be orders of magnitude less than a busy transactional table.
When an analytic data load fails, there is likely no need to preserve any of the partially loaded data, nor is the removal
of partially loaded data complex, as it is typically time-stamped. Many OLAP data load
processes will include steps that automatically remove any data previously loaded for
the same time period. This allows a failed process to be rerun without the prerequisite
cleanup step.
The consumers of analytic data are also quite different from those that use
transactional applications. Whereas an OLTP application may have a wide range of end
users that regularly interact directly with the software, analytic data consumers tend to
be analysts, domain experts, and business leaders. Domain experts are people that are
close enough to the data to understand its meaning and usage. For example, a network
administrator would be a domain expert for network traffic data, whereas a chief
financial officer would be a domain expert for quarterly earnings data. Domain experts
may have a clear understanding of data meaning, but they are less likely to be familiar
with how that data is stored and processed.
Unlike OLTP data, analytic data is often denormalized. Lookup tables may be
maintained for reporting purposes, but not used exclusively as lookups to fact tables.
Normalization does not always save significant storage or memory resources in OLAP
tables. Equally important, joining lookup tables during data load processes can greatly
increase their latency, resulting in longer delays before new analytic data becomes
available. Most compression algorithms used in mainstream analytic data structures
utilize dictionary lookups to “normalize” data when it is physically stored. This means
that repeated values take up far less space than in a transactional table, regardless of
the column’s specific width. Despite that, dictionaries are limited in size and wider data
types will pose compression challenges, which will be tackled in Chapter 5.
Analytic data can be summarized as follows:
• Often denormalized.
The characteristics of analytic data are quite different from those of transactional
data. These different characteristics tend to lead toward using different systems to
store each.
When analytic and transactional workloads share a database, long-running reporting
queries cause locking and blocking against objects that application users are trying
to write to. If a salesperson running an annual report is blocking the users trying to
purchase new products, then the result will be lost sales and disgruntled users.
Similarly, those large reports may consume vast amounts of memory, forcing
common transactional data out of the buffer pool in favor of reporting data. This will
further exacerbate slowness for application users.
In general, mixing analytic and transactional workloads may not always be
avoidable, but considering the ramifications of this architectural decision up front can
help in predicting (and planning for) problems before they become costly.
Many other considerations will vary between OLTP and OLAP systems, such as
• Maintenance
• High availability
• Disaster recovery
• Hardware resources
The architect of an application will need to ensure that each of these topics is visited
separately for both transactional and analytic workloads. While it is possible that similar
systems can be used when data sizes are small, it is unlikely that those same systems
will scale indefinitely. Eventually, data becomes large enough that unique solutions are
required to differentiate each type of workload.
The nuance of these decisions can travel deeper. For example, a small analytic
database would likely be architected differently than a large data repository. Therefore,
it’s important to consider size, scale, data growth, and workload together when
architecting new data structures or when revisiting existing database objects.
Taking this information into account allows for a vision to be created of where
analytic data should be stored. The remainder of this book will focus on how
columnstore indexes in SQL Server can be used to effectively meet the needs of large
analytic tables and workloads. This technology is mature and can be used to effectively
manage a wide variety of analytic use cases, ensuring that it provides an ideal solution
to many of the analytics challenges that architects and administrators face on a
regular basis.
CHAPTER 3
What Are Columnstore Indexes?
In a clustered rowstore index, each row is stored on pages sequentially, one row after
another in the order prescribed by the clustered index. If many columns from a single
row or small subset of rows are required by a query, then this is an exceptionally efficient
storage structure. For example, retrieving all columns from row 1 through row 5 would
require low effort as the data is contiguous and ordered. Figure 3-2 highlights the data
needed to satisfy this query.
Figure 3-2. How an OLTP query retrieving five rows would read a sample table
A table with sequential columns models how typical transactional queries operate
for a query such as this:
SELECT
OrderID, -- An identity/primary key column
CustomerID,
SalespersonPersonID,
ContactPersonID,
OrderDate
FROM Sales.Orders
WHERE OrderId = 289;
Transactional queries write single rows or small ranges of rows that often correlate
to a lookup on a single index. Here, a numeric identity is used to filter out five rows of
interest from a larger table.
The basic unit of SQL Server storage is the page. A page consists of 8 kilobytes
of data. Pages contain all data and index storage for a given table. If a single row is
required for a query, then the entire page it is stored on will be read into memory, even
if the remaining data on the page is not needed. Therefore, transactional queries rely
on clustered or nonclustered indexes to ensure that index seeks can return data in an
ordered fashion, such as the example in Figure 3-2. In this scenario, reading those five
rows will not require much more effort, even if the table grows to be significantly larger.
Analytic queries, though, are quite different. They often aggregate a select few
columns, but do so across many rows. Consider a typical OLAP query against the
example table where a single column is aggregated across a large portion of the table that
happens to include all of the rows shown here. Figure 3-3 shows how this would look.
Figure 3-3. An analytic query aggregating a single column from a sample table
The physical layout of rows shows that a query requiring only column 2 needs to also
read the adjacent columns, even if not needed. The following is an example of a query
that accesses a large amount of data in a table, but only aggregates a single column.
SELECT
SUM(Quantity) AS Total_Quantity
FROM Sales.OrderLines
WHERE OrderID >= 1
AND OrderID < 10000;
Even though only a single column was summed, any page that contains values for
that column will need to be read into memory. Even a covering nonclustered index
would still necessitate reading data for each row, though the number of pages read can
be reduced in that fashion.
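For illustration, a covering nonclustered index for the query above might look like this sketch (the index name is arbitrary):

CREATE NONCLUSTERED INDEX IX_OrderLines_OrderID_Quantity
ON Sales.OrderLines (OrderID)
INCLUDE (Quantity);  -- Covers the SUM without key lookups

This narrows each page read to just OrderID and Quantity values, but SQL Server must still touch one index row for every order line in the requested range.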
Now consider a larger table with 50 million rows that is stored on 250,000 pages
using a clustered rowstore index. The transactional query demonstrated in Figure 3-2
would still only read 5 rows that are stored consecutively and could therefore ignore
virtually all of the remaining 49,999,995 rows and the pages they are stored on. Additional rows
would be read that happened to be stored on the same page(s) as those five rows, but the
added burden of that data is measured in kilobytes and is comparatively insignificant.
The analytic query presented in Figure 3-3 aggregates only a single column, but does
so across many rows. If the requested order data spans one-fourth of the table, then
processing this query in the transactional table would force SQL Server to read about
one-fourth of the pages in the table, since the values for column 2 are dispersed within
each row throughout the table. Regardless of the filter used that reduces the row count
needed for the query, every page in the range specified by the filter would need to be read.
As an analytic table continues to grow and spans billions of rows and/or terabytes of
storage (or more), the ability to read large swaths of its data becomes too slow and resource
intensive to be realistic. A better solution is needed that allows analytic queries to read
column data without the rest of the underlying columns being brought along for the ride.
While this may sound like a sales pitch, there are no exaggerations here. Comparing
a large analytic data set stored in a classic rowstore clustered index vs. a columnstore
indexed table yields vast differences in storage and performance that will be
demonstrated and quantified throughout the remainder of this book.
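For a quick look at the storage side of that claim on any system, sp_spaceused can compare a rowstore table against a columnstore-indexed copy of the same data; the table names below are placeholders:

-- Hypothetical names; substitute any rowstore/columnstore pair.
EXEC sp_spaceused 'dbo.SaleFact_Rowstore';
EXEC sp_spaceused 'dbo.SaleFact_Columnstore';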
Note that all demonstrations of columnstore indexes in this book are tested on
SQL Server 2019. Those running an earlier version should test thoroughly before
implementing any suggestions in this book as the features available may be different.
Figure 3-4 illustrates how data in a columnstore index is physically stored, using the
same table from Figure 3-1.
Note the critical difference: Data is organized by column rather than by row. All
values for column 1 are stored together in a single structure, whereas each other column
is stored in its own set of pages. In the rowstore table shown in Figure 3-3, a single
column was aggregated, but every page had to be read in order to return data for that
one column.
Figure 3-5 shows the same query against an analytic columnstore indexed table.
A complete list of columnstore index features added in each version of SQL Server can be found here:
https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-what-s-new
The list is quite extensive and shows how columnstore indexes have evolved from an
inflexible read-only structure into one rich in features and optimizations.
Storing data natively in SQL Server means there is no need for third-party products,
no costly migrations, and no need to configure new hardware and software. The
time required to implement them is less guided by technology and more by typical
development and quality assurance needs. When building a project plan for the storage
and maintenance of analytic data, factoring in these resource costs can help in making
an accurate, fact-based decision.
Quantifying each of these factors can assist in comparing and contrasting different
solutions and will generally provide favorable results for an organization that already
uses SQL Server for its transactional data storage.
Scalability
Given the rapid rate at which analytic data can grow, any data structures used to store it
must be capable of efficiently servicing OLAP workloads, even when data depth or width
increases unexpectedly fast over time.
The ability for an analytic data storage solution to scale has many important benefits,
including
• Ensuring high performance, even during periods of rapid data growth
For an OLAP solution to be effective, it needs to be fast and efficient with one
thousand rows, one million rows, one billion rows, or more. If a system is destined to
become inefficient when it gets large, then it is also destined to fail.
Data growth is typically driven by two organizational forces:
• Growth
• Complexity
As an organization grows and serves more customers, those customers will generate
more data. Similarly, organizational growth almost always leads to both technical and
nontechnical processes becoming more complex. This complexity may manifest itself in
increased software features, more processes that require tracking, or requests for more
types of data to maintain and crunch over time.
The rate of growth of data can be summarized as follows:

Projected data size = 1.00 * N * G * C

Here, G represents organic growth of the existing workload, C represents growth
due to increasing complexity, and N represents the new data that per unit time will be
added to an analytic data source. This can be approximated in storage units as soon as
the data types for new metrics are known.
Each of these factors is defined here as linear in nature, but may not be linear
in actuality. Therefore, they should be revisited regularly to ensure that unexpected
changes in growth are accounted for when predicting future data size. Linear
approximations per unit time can be used to approximate nonlinear growth so long as
those approximations remain updated with current and future trends.
To provide a sample of the preceding formula, consider an analytic table that
contains 100,000,000 rows, currently grows by 250,000 rows per day, and is expected to
see its growth accelerate by an additional 25% per year, but not see any new dimensions
added in the foreseeable future. The estimated row count in 1 year would be given by

1.00 * 100,000,000 * 1.9125 * 1.25 = 239,062,500 rows

Note how quickly the added acceleration grew this data: compounding both factors
annually takes the table from 100 million rows to over 1.3 billion rows in 3 years! 1.9125
was calculated as the annual growth factor by multiplying 250,000 rows per day by 365
days per year (91,250,000 new rows) and adding that to the existing 100,000,000 rows
(191,250,000 / 100,000,000 = 1.9125).
To summarize, any technology that manages analytic data needs to be capable of
efficient data access given that typical data growth can cause data sizes to balloon far
faster than conventional wisdom might suggest. The architecture of columnstore indexes
will be discussed in detail in Chapter 4 and will lay the foundation for explaining why
they can scale so effectively for rapidly growing data.
Exceptional Compression
Any analytic data store needs to take full advantage of compression to make its data
as compact as possible. Columnstore indexes use multiple compression algorithms to
achieve impressively high compression ratios. This can result in data that is 10–100 times
smaller than it would be in an uncompressed table.
Compressing data by column rather than row is inherently more efficient for a
number of reasons:
• A single column contains values of the same data type, improving the
chances of values being repeated.
• Values within a single column typically describe the same dimension
or measure and therefore fall into narrower, more repetitive ranges.
Chapter 8 will fully explore how data is written to columnstore indexes, including
performance measurements, resource consumption details, and best practices for
loading data as efficiently as possible.
Analytic data requires a versatile, scalable, and performant solution. Columnstore
indexes provide an ideal data structure for storing and analyzing analytic data and can
manage data growth over time with ease.
The remainder of this book will discuss columnstore indexes in exhaustive detail,
providing architectural details, demonstrations, best practices, and tools that can
improve their use.
CHAPTER 4
Columnstore Index Architecture
A solid understanding of the architecture of columnstore indexes is necessary to make
optimal use of them. Best practices, query patterns, maintenance, and troubleshooting
are all based on the internal structure of columnstore indexes. This chapter will focus on
these architectural components, providing the foundation for the rest of this book.
Sample Data
To demonstrate the topics presented throughout this book, a sample data set will be
created based on the Fact.Sale table in the WideWorldImportersDW database.
The data set can be generated using the query in Listing 4-1.
Listing 4-1. Query Used to Generate a Data Set for Use in Columnstore Index Testing
SELECT
    Sale.[Sale Key], Sale.[City Key], Sale.[Customer Key],
    Sale.[Bill To Customer Key], Sale.[Stock Item Key],
    Sale.[Invoice Date Key], Sale.[Delivery Date Key],
    Sale.[Salesperson Key], Sale.[WWI Invoice ID], Sale.Description,
    Sale.Package, Sale.Quantity, Sale.[Unit Price], Sale.[Tax Rate],
    Sale.[Total Excluding Tax], Sale.[Tax Amount], Sale.Profit,
    Sale.[Total Including Tax], Sale.[Total Dry Items],
    Sale.[Total Chiller Items], Sale.[Lineage Key]
FROM Fact.Sale
CROSS JOIN Dimension.City
WHERE City.[City Key] >= 1 AND City.[City Key] <= 110;
This generates 25,109,150 rows of data spanning invoice date ranges of 1/1/2013
through 5/31/2016. While this data is not massive, it is large enough for suitable
demonstrations without being cumbersome to those that wish to replicate it at home.
This data set will be reused in future chapters, being placed into a number of test tables
to illustrate a wide variety of topics related to columnstore indexes, OLAP performance,
and database architecture.
For this chapter, the data will be loaded into a table without any indexes, and a
columnstore index added at the end, as seen in Listing 4-2.
Listing 4-2. Script That Creates and Populates a Columnstore Index Test Table
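The script follows directly from Listing 4-1; a minimal sketch of its shape, assuming a SELECT INTO load followed by index creation (the index name and exact options in the original listing may differ):

SELECT
    Sale.[Sale Key], Sale.[City Key], Sale.[Customer Key],
    Sale.[Bill To Customer Key], Sale.[Stock Item Key],
    Sale.[Invoice Date Key], Sale.[Delivery Date Key],
    Sale.[Salesperson Key], Sale.[WWI Invoice ID], Sale.Description,
    Sale.Package, Sale.Quantity, Sale.[Unit Price], Sale.[Tax Rate],
    Sale.[Total Excluding Tax], Sale.[Tax Amount], Sale.Profit,
    Sale.[Total Including Tax], Sale.[Total Dry Items],
    Sale.[Total Chiller Items], Sale.[Lineage Key]
INTO fact.Sale_CCI
FROM Fact.Sale
CROSS JOIN Dimension.City
WHERE City.[City Key] >= 1 AND City.[City Key] <= 110;

-- Add the columnstore index once the load completes.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sale_CCI ON fact.Sale_CCI;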
The size and shape of this data can be confirmed with a simple query:
SELECT
COUNT(*),
MIN([Invoice Date Key]),
MAX([Invoice Date Key])
FROM fact.Sale_CCI;
Figure 4-1. Query results showing the size and date range for a set of test data
Rowgroups and Segments
Analytic data cannot be stored in one large contiguous structure. Each column is
compressed and stored separately, and the data within each column also needs to be
broken into groups that are compressed individually. Each compressed unit is what is read into memory. If
that unit is too small, then the storage and management of a multitude of compressed
structures would be expensive, and its compression poor. Conversely, if the number of
rows in each unit is too large, then the amount of data that needs to be read into memory
to satisfy queries would also become too large.
Columnstore indexes group rows into units of 2^20 (1,048,576) rows that are called
rowgroups. Each column within that rowgroup is individually compressed into the
fundamental unit of a columnstore index called a segment. This structure can be
visualized using the representation in Figure 4-2.
Note that columnstore indexes are not built on B-tree structures like clustered
and nonclustered rowstore indexes are. Instead, each rowgroup contains a set of
compressed segments, one per column in a table. The example in Figure 4-2 is for a table
with eight columns and up to 6 × 2^20 rows, containing a total of 48 segments (one segment
per column per rowgroup). The only significant architectural convention shared
between rowstore and columnstore indexes is their use of 8KB pages to store their data.
Rowgroups are created and managed automatically as data is created in a
columnstore index. There is no cap on the number of rowgroups that can exist, nor are
there limits on the count of segments within the index. Because a rowgroup contains
up to 2^20 rows, a table should have many more rows than this to make optimal use of a
columnstore index. If a table has 500k rows, then it is likely to all be stored in a single
rowgroup. As a result, any query requiring data from the table would need to read
segments that contain data for all rows in the table. For a table to make effective use of
a columnstore index, it should have at least 5 million or 10 million rows so that it can
be broken into separate rowgroups of which not all need to be read each time a query is
issued against it.
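As a quick sanity check, the number of compressed rowgroups a table will need can be estimated by dividing its row count by 2^20; a sketch against the test table built earlier:

-- 25,109,150 rows / 1,048,576 rows per rowgroup ≈ 24 rowgroups
SELECT CEILING(COUNT(*) / 1048576.0) AS estimated_rowgroup_count
FROM fact.Sale_CCI;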
The rowgroups within a columnstore index may be viewed using the dynamic
management view sys.column_store_row_groups. The query in Listing 4-3 returns
rowgroup metadata for the columnstore index created earlier in this chapter.
Listing 4-3. Script to Return Basic Rowgroup Metadata for a Columnstore Index
SELECT
tables.name AS table_name,
indexes.name AS index_name,
column_store_row_groups.partition_number,
column_store_row_groups.row_group_id,
column_store_row_groups.state_description,
column_store_row_groups.total_rows,
column_store_row_groups.deleted_rows,
column_store_row_groups.size_in_bytes
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
WHERE tables.name = 'Sale_CCI'
ORDER BY tables.object_id, indexes.index_id,
column_store_row_groups.row_group_id;
Listing 4-4. Script to Return Basic Segment Metadata for a Columnstore Index
SELECT
tables.name AS table_name,
indexes.name AS index_name,
columns.name AS column_name,
partitions.partition_number,
column_store_segments.row_count,
column_store_segments.has_nulls,
column_store_segments.min_data_id,
column_store_segments.max_data_id,
column_store_segments.on_disk_size
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.indexes
ON indexes.index_id = partitions.index_id
AND indexes.object_id = partitions.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
AND column_store_segments.column_id = columns.column_id
WHERE tables.name = 'Sale_CCI'
ORDER BY columns.name, column_store_segments.segment_id;
The row count returned by the query is equal to the total number of segments in the
index, which is subsequently equal to the count of rowgroups multiplied by the number
of columns in the table. In addition to the row count contained in each segment and
its size on disk, there are details given as to whether the segment has NULLs in it, and
min_data_id/max_data_id, which provide dictionary lookups for values contained in
the segment. Details about dictionaries and how they work will be provided alongside
the discussion of compression in Chapter 5.
Delta Store
Figure 4-5. Flow of data from the delta store into a columnstore index
Note that the delta store is not used to manage all INSERT operations against a
columnstore index. Chapter 8 discusses in detail how bulk load processes are used to
greatly speed up larger INSERT operations.
Delete Bitmap
When rows are deleted from a columnstore index, they are not physically removed
from the compressed segments. Instead, deleted rows are flagged in a delete bitmap
that is overlaid onto the columnstore index. The biggest upside of having a delete bitmap is that DELETE
operations can execute exceptionally fast as there is no need to decompress, update,
and recompress segments in the index. The downside is that deleted rows still take up
space in the index and are not immediately removed. Over time, the space consumed by
deleted rows may become nontrivial, at which point index maintenance (discussed in
Chapter 14) can periodically be used to reclaim this space.
Figure 4-6 adds the delete bitmap into the architecture diagram of columnstore indexes.
Figure 4-6. Adding the delete bitmap into the columnstore index architecture
Both the delete bitmap and the delta store are components of a columnstore index
that only exist when needed. If no deleted rows exist, then there will be no delete bitmap
to overlay onto the compressed rowgroups. Similarly, an empty delta store will not
need to be checked when queries are executed against the columnstore index. These
components exist when necessary, but otherwise will have no impact on performance
when not needed.
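Both components can be observed via the rowgroup physical stats DMV available in SQL Server 2016 and later; OPEN rowgroups represent the delta store, while deleted_rows reflects the delete bitmap. A minimal sketch:

SELECT
    OBJECT_NAME(object_id) AS table_name,
    row_group_id,
    state_desc,    -- OPEN = delta store; COMPRESSED = columnstore segments
    total_rows,
    deleted_rows   -- Rows flagged in the delete bitmap
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('fact.Sale_CCI');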
Note the following brief definitions for the usage of hot/warm/cold data:
• Hot Data: Real time and actively used by a system with regular reads,
writes, and high concurrency. Availability is expected to be high and
latency should be very low.
• Warm Data: Accessed regularly for reporting or recent history, but
with lower concurrency and less stringent latency expectations
than hot data.
• Cold Data: Rarely accessed, often historical or archival in nature,
where higher latency and lower-cost storage are acceptable.
Organizations will label data in a range from hot to cold, and the specifics will
vary depending on how their data is used. Figure 4-7 shows the interaction between a
clustered rowstore index and a filtered nonclustered columnstore index.
All data in SQL Server that is not stored in memory-optimized structures is stored on
8 kilobyte pages. These pages reside on physical storage and are read into memory when
needed as-is. If a page is compressed, it remains compressed until its data is needed.
Whether a page is in a rowstore or columnstore index, it is read into memory in the same
fashion. The delete bitmap and delta store are also maintained on pages and read into
memory as needed.
Figure 4-8 illustrates the basic structure of a page in SQL Server. The primary
components of pages are the
• Page header
• Data
• Row offsets
The page header contains basic information about the page such as the object that
owns it, the type of data stored in it, and the amount of unallocated space available for
additional data to be written.
The data rows are the actual data stored on the page. This may be the physical data
in a table (the clustered index or heap), index entries, or a variety of other contents that
are available under a variety of circumstances. For this discussion of columnstore and
rowstore indexes, the data and index data are all that is of immediate concern.
Row offsets store the position that each row starts, allowing SQL Server to locate the
data for any given row once a page is read into memory.
When data is stored in clustered indexes on a rowstore table, data is written to pages
row by row. That is, each column is written sequentially on the page for each row one
after another, as seen in Figure 4-9.
This table has six columns, and they are written sequentially for each row, one after
another. SQL Server will continue to write rows sequentially for this table to the same
page until it runs out of space, at which point a new page will be allocated to the table
and the data will continue there.
This structure is optimized for reading or writing small range lookups that consist
of a set of sequential rows. For example, a query that returned some (or all) of the columns
from rows two through five would only need to read the single page shown in Figure 4-9.
If additional rows were required that were on other pages, then those pages would also
be read into memory to satisfy the query.
A clustered or nonclustered rowstore index writes its B-tree structure to pages
as a sequence of pointers. The root level is written first, followed by intermediate levels
that ultimately point to the underlying data at the leaf level of the index. Each level
of the index includes the clustered index keys, which are used to organize and link
between levels of the index. This structure is also optimized for seeking single or small
contiguous ranges of rows. For queries like this, fewer intermediate levels of an index need
to be read, reducing the number of pages needed to satisfy a query. Figure 4-10 shows a
visualization of a clustered rowstore (B-tree) index.
Because this is the clustered index, the leaf levels contain the column data for each
row that is referenced by the clustered index. In a nonclustered index, the lowest level of
the index would contain pointers to the target data, to be used for key lookups. Similarly,
the leaf level of a nonclustered index would also contain any included columns that are
defined on the index.
A query that needed to read clustered index ID values between 25 and 75 could do
so by reading the index root page first and then pages 1 and 2. A query reading only
clustered index ID value 136 would read the index root and page 4.
In a columnstore index, there is no B-tree structure. A clustered columnstore
index is written as a sequence of compressed segments on as many pages as is needed
to support them. Figure 4-11 illustrates how this would look for a single page of a
columnstore index.
Note that this data, a subset of a compressed segment, contains sequential values for
the same column. This sequence would continue until the end of the rowgroup, at which
point a new segment would begin, or if it were the last of the rowgroups, the next column
would begin.
A query that aggregated values for the first 40 rows of column 1 could retrieve data
very efficiently and do so by simply reading this single page. In a rowstore index for
the same table, it would be necessary to scan data for all columns in those 40 rows and
read those pages into memory. For a table with 20 columns, that would require 20 times
the effort.
Summarizing Differences
The key to understanding the difference between the architecture of rowstore indexes
and columnstore indexes is to consider the logical storage of data vs. the physical storage
of data.
Rowstore indexes store data physically in order by row and then column. Data is
logically accessible as it is ordered by row. Therefore, this architecture is optimized for
queries that seek a sequence or limited subset of rows. B-tree indexes provide a
speedy way to search for that data based on other columns that the clustered index does
not order the data by.
Columnstore indexes store data physically grouped by column, but ordered by row.
This convention allows for filters to limit the number of rows returned and column
groupings to reduce the pages read as the number of columns needed for a query
decreases. This allows queries to read significantly more rows, but maintain excellent
performance when the column list is limited. Chapter 10 explores how the order of data
within a columnstore index can be controlled to add an additional dimension of filtering
so that data can be sliced both horizontally and vertically.
CHAPTER 5
Columnstore Compression
There are many features that allow columnstore indexes to perform exceptionally well
for analytic workloads. Of those features, compression is the most significant driver
in both performance and resource consumption. Understanding how SQL Server
implements compression in columnstore indexes and how different algorithms are used
to shrink the size of this data allows for optimal architecture and implementation of
analytic data storage in SQL Server.
Analytic data tends to be highly repetitive, often consisting of columns such as
• Entity types
• Money
• Dates
• Counts
• Repetitive dimensions
Eleven rows are represented, and each column contains repeated values. The
Date column contains only three distinct values, whereas Status contains four and Hours
contains six. This repetition tends to scale well. If this were the data from a rowgroup
containing one million rows, it would be reasonable to expect a column like Status to
only contain a small number of values that are repeated frequently throughout the data
set. The more repetitive the data is, the more effectively it will compress.
As is expected, more values are repeated within the values present in a single column
than across different columns. This is true for two primary reasons:
• Different data types will have vastly different values. For example,
the contents of a string column are not easily comparable to those of a
money column.
• Values within a single column describe the same attribute and
therefore tend to repeat far more often.
Columnstore indexes align perfectly with these conventions, whereas rowstore indexes
struggle because row and page compression are forced to take into account all columns in a
row or page, which will comprise a wide variety of data types and dimensions.
Note that the delta store is not columnstore compressed. This data structure needs to be
written to as quickly as possible at runtime, and therefore forgoing expensive compression
removes the time and overhead required to process writes against it. This means that
the delta store takes up more space than it would if columnstore compression were
used. Since the delta store tends to be small in comparison to the compressed columnstore
index contents and contains a predictable count of rows, the effort needed to maintain it is
constant and does not grow over time. Depending on SQL Server version, delta rowgroups
may use row or page compression to limit their storage and memory footprint.
Encoding is an important step in the compression process. The following sections
describe the different encoding algorithms, how they are used, and how they impact
overall compression effectiveness.
Value Encoding
This process seeks to reduce the footprint of numeric columns by adjusting their
storage to use smaller sized values. For integer data, which consists of TINYINT,
SMALLINT, INT, and BIGINT, values are compressed by a variety of mathematical
transformations, such as dividing all values by a common divisor or subtracting a
common subtrahend from all values. Consider a column of a 4-byte integer data type
that contains the values shown in Table 5-1.
Table 5-1. Sample Integer Values and Their Minimum Storage Sizes

Values     Minimum Size in Bits
60         6
150        8
90         7
9000       13
630        10
300000     19
The first transformation divides each value by a common divisor, in this case 10, as
seen in Table 5-2.

Table 5-2. Values After Division by a Common Divisor

Values     Value / 10    Minimum Size in Bits
60         6             3
150        15            4
90         9             4
9000       900           10
630        63            6
300000     30000         15
The next compression step is to take the smallest value in the list as the base, zero it
out, and reference other values by subtracting that base from them, as seen in
Table 5-3.
Table 5-3. Values After Rebasing on the Smallest Value

Values     Value / 10    Value / 10 - 6    Minimum Size in Bits
60         6             0                 0
150        15            9                 4
90         9             3                 2
9000       900           894               10
630        63            57                6
300000     30000         29994             15
Reviewing the minimum space required to store the values, the original size was
63 bits. After the first transformation, the size is reduced to 42 bits, and after the
last transformation, it is reduced to 37 bits. Overall, the reduction in size for this
set of values was about 41%. The level of value compression seen will vary based on
the set of values and can be significantly higher or lower. For example, a set of
values that are all the same will compress into a single zeroed-out reference, as seen
in Table 5-4.
Table 5-4. Identical Values Compressed to a Zeroed-Out Reference

Values      Encoded Value    Original Size in Bits    Reduced Size in Bits
1700000     0                22                       0
1700000     0                22                       0
1700000     0                22                       0
1700000     0                22                       0
1700000     0                22                       0
While this example may seem extreme, it is not that unusual as dimensions often
contain repeated values that fall into limited ranges.
Note that real-world transformations of data that occur in columnstore indexes
are more complex than this, but the process to determine their details is essentially
the same.
Decimals are converted into integers prior to compression, at which point they
follow the same value encoding rules as integers would. For example, Table 5-5 shows
the compression process for a set of tax rates that are stored in decimal format.
Table 5-5. Compression Process for Decimal Tax Rates (columns: Values, Value * 100, Value / 5, Value / 5 - 70, Original Size in Bits, Reduced Size in Bits)
The original data is stored as DECIMAL(4,2), which consumes 5 bytes (40 bits) per
value. By converting to integers prior to compression, multiple transformations are
applied that reduce storage from 200 bits to 26 bits. Decimals may not seem the most
likely candidates for effective compression, but once they are converted into integer
values, they can take advantage of any of the transformations that can be applied to
integers.
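As a sketch of this process, the transformation chain from Table 5-5 can be reproduced directly in T-SQL (the tax rates below are hypothetical placeholders, not the original sample data):

-- Each DECIMAL(4,2) rate is scaled to an integer (* 100),
-- divided by a common divisor (5), and rebased (- 70):
SELECT
    rates.tax_rate,
    rates.tax_rate * 100 AS value_times_100,
    (rates.tax_rate * 100) / 5 AS value_div_5,
    (rates.tax_rate * 100) / 5 - 70 AS value_rebased
FROM (VALUES (CAST(3.50 AS DECIMAL(4,2))), (3.75), (4.00), (4.25), (4.50)) AS rates(tax_rate);

Each resulting value is small enough to store in a handful of bits, rather than the 40 bits required by the original DECIMAL(4,2) storage.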
Dictionary Encoding
String data is stored more efficiently using dictionary encoding. In this algorithm,
each distinct string value in a segment is inserted into a dictionary and indexed using
an integer pointer to the dictionary. This provides significant space savings for string
columns that contain repeated values. Consider a segment that contains foods of the
data type VARCHAR(50), as seen in Table 5-6.
Table 5-6. Sample VARCHAR(50) Values and Their Sizes

Value                          Size in Bytes
Taco                           4
Hamburger                      9
Fish Fry                       8
Taco                           4
Hamburger                      9
Taco                           4
Triple-Layer Chocolate Cake    27
Triple-Layer Chocolate Cake    27
Triple-Layer Chocolate Cake    27
Cup of Coffee                  13
The first step to compressing this data is to create an indexed dictionary, as seen in
Table 5-7.
Table 5-7. Indexed Dictionary for the Sample Data

Dictionary ID    Value
0                Fish Fry
1                Hamburger
2                Taco
3                Triple-Layer Chocolate Cake
4                Cup of Coffee
Since the segment contains five distinct values, the index ID for the dictionary will
consume 3 bits. The mapping in Figure 5-2 shows the resulting dictionary compression
of this data.
The original data required 132 bytes in storage, whereas the encoded data required
only 30 bits (3 bits per value), plus the space required to store the lookup values in the
dictionary. The dictionary itself consumes space for each distinct value in the segment
data, but as a bonus, a dictionary may be shared by multiple rowgroups. Therefore, a
table with 100 rowgroups may reuse dictionaries for each column across all of those
rowgroups.
The key to dictionary compression is cardinality. The fewer distinct values present
in the column, the more efficiently the data will compress. A CHAR(100) column with
5 distinct values will compress significantly better than a CHAR(10) column with 1000
distinct values. The dictionary size can be roughly estimated as the product of distinct
values and their initial size. Lower cardinality means that there are fewer string values
to store in the dictionary and the index used to reference the dictionary will be smaller.
Since the dictionary index IDs are repeated throughout rows in the table, a smaller size
can greatly improve compression.
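This estimate can be roughed out ahead of time with a simple query. A minimal sketch, assuming a candidate string column (the table and column names are placeholders):

-- Approximate dictionary size: distinct values times their average length
SELECT
    COUNT(DISTINCT [Description]) AS distinct_values,
    COUNT(DISTINCT [Description]) * AVG(DATALENGTH([Description])) AS estimated_dictionary_bytes
FROM Fact.Sale;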
Dictionaries can be viewed in SQL Server using the view sys.column_store_
dictionaries. The query in Listing 5-1 returns information about the dictionaries used in
the fact.Sale_CCI table created in Chapter 4.
SELECT
    partitions.partition_number,
    objects.name AS table_name,
    columns.name AS column_name,
    CASE
        WHEN column_store_dictionaries.dictionary_id = 0 THEN 'Global Dictionary'
        ELSE 'Local Dictionary'
    END AS dictionary_scope,
    CASE
        WHEN column_store_dictionaries.type = 1 THEN 'Hash dictionary containing int values'
        WHEN column_store_dictionaries.type = 2 THEN 'Not used' -- Included for completeness
        WHEN column_store_dictionaries.type = 3 THEN 'Hash dictionary containing string values'
        WHEN column_store_dictionaries.type = 4 THEN 'Hash dictionary containing float values'
    END AS dictionary_type,
    column_store_dictionaries.entry_count,
    column_store_dictionaries.on_disk_size
FROM sys.column_store_dictionaries
INNER JOIN sys.partitions
ON column_store_dictionaries.hobt_id = partitions.hobt_id
INNER JOIN sys.objects
ON objects.object_id = partitions.object_id
INNER JOIN sys.columns
ON columns.column_id = column_store_dictionaries.column_id
AND columns.object_id = objects.object_id
WHERE objects.name = 'Sale_CCI';
This view provides the cardinality of the column in entry_count, as well as the size of
the dictionary and its type; this data offers a number of noteworthy takeaways.
This metadata provides detail about each dictionary in the columnstore index,
but does not directly relate segments to dictionaries. The query in Listing 5-2 links
sys.column_store_segments to sys.column_store_dictionaries to provide this added
information.
SELECT
    column_store_segments.segment_id,
    types.name AS data_type,
    types.max_length,
    types.precision,
    types.scale,
    CASE
        WHEN PRIMARY_DICTIONARY.dictionary_id IS NOT NULL THEN 1
        ELSE 0
    END AS does_global_dictionary_exist,
    PRIMARY_DICTIONARY.entry_count AS global_dictionary_entry_count,
    PRIMARY_DICTIONARY.on_disk_size AS global_dictionary_on_disk_size,
    CASE
        WHEN SECONDARY_DICTIONARY.dictionary_id IS NOT NULL THEN 1
        ELSE 0
    END AS does_local_dictionary_exist,
    SECONDARY_DICTIONARY.entry_count AS local_dictionary_entry_count,
    SECONDARY_DICTIONARY.on_disk_size AS local_dictionary_on_disk_size
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.objects
ON objects.object_id = partitions.object_id
INNER JOIN sys.columns
ON columns.object_id = objects.object_id
AND column_store_segments.column_id = columns.column_id
INNER JOIN sys.types
ON types.user_type_id = columns.user_type_id
LEFT JOIN sys.column_store_dictionaries PRIMARY_DICTIONARY
ON column_store_segments.primary_dictionary_id = PRIMARY_DICTIONARY.dictionary_id
AND column_store_segments.primary_dictionary_id <> -1
AND PRIMARY_DICTIONARY.column_id = columns.column_id
AND PRIMARY_DICTIONARY.hobt_id = partitions.hobt_id
LEFT JOIN sys.column_store_dictionaries SECONDARY_DICTIONARY
ON column_store_segments.secondary_dictionary_id = SECONDARY_DICTIONARY.dictionary_id
AND column_store_segments.secondary_dictionary_id <> -1
AND SECONDARY_DICTIONARY.column_id = columns.column_id
AND SECONDARY_DICTIONARY.hobt_id = partitions.hobt_id
WHERE objects.name = 'Sale_CCI'
AND columns.name = 'Bill To Customer Key';
Note that if a table is partitioned or has many columns, the result set could become
quite large, so this query is filtered by both table and column to focus on a single set of
segments, for demonstration purposes. The results help to fill in the blanks for dictionary
metadata, as seen in Figure 5-4.
Figure 5-4. Dictionary metadata detail for Sale_CCI.[Bill To Customer Key]
This detail differentiates between local and global dictionaries and adds some
column-level detail. Based on the results, the integer column [Bill To Customer Key]
compresses quite well as it contains only three distinct values in its dictionary.
Sixty-eight bytes for a dictionary is tiny and provides immense savings for a column
that previously required 4 bytes per row across 25 million rows. Each value is reduced
from 4 bytes (32 bits) to 2 bits, allowing it to be compressed at a ratio of 16:1!
It is important to note that a dictionary is limited in size to 16 megabytes. Since a
segment may only have one global and one local dictionary assigned to it, if a dictionary
reaches the 16MB threshold, then a rowgroup will be split into smaller rowgroups to
allow for each to possess dictionaries that are below this limit. If inserting new rows into
a table would cause any of its dictionaries to exceed the dictionary size cap, then a new
rowgroup will be created to accommodate new rows. There are two key takeaways from
this information:
Numeric columns with high cardinality will typically use value compression to avoid
this problem, whereas string columns that are long and/or have a high cardinality may
not be able to avoid this issue. Sixteen megabytes may sound small when compared to
massive OLAP tables, but dictionaries themselves are compressed and typical analytic
data will not approach the dictionary size limit. Testing should always be used to confirm
either of these scenarios, rather than assuming they will or will not happen.
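As a starting point for that testing, the metadata view introduced in Listing 5-1 can flag dictionaries approaching the cap. A minimal sketch (the 80% threshold is an arbitrary choice):

SELECT
    objects.name AS table_name,
    columns.name AS column_name,
    column_store_dictionaries.entry_count,
    column_store_dictionaries.on_disk_size
FROM sys.column_store_dictionaries
INNER JOIN sys.partitions
ON column_store_dictionaries.hobt_id = partitions.hobt_id
INNER JOIN sys.objects
ON objects.object_id = partitions.object_id
INNER JOIN sys.columns
ON columns.column_id = column_store_dictionaries.column_id
AND columns.object_id = objects.object_id
WHERE column_store_dictionaries.on_disk_size > 16 * 1024 * 1024 * 0.8
ORDER BY column_store_dictionaries.on_disk_size DESC;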
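One way to reduce dictionary pressure on a long string column is to normalize it into a dimension table keyed by a small integer. A minimal sketch of the design compared below (the column list and sizes are illustrative assumptions, not the exact definitions used here):

CREATE TABLE Dimension.Sale_Description
(   Description_Key SMALLINT NOT NULL CONSTRAINT PK_Sale_Description PRIMARY KEY CLUSTERED,
    [Description] VARCHAR(100) NOT NULL);

CREATE TABLE Fact.Sale_CCI_Normalized
(   [Sale Key] BIGINT NOT NULL,
    Description_Key SMALLINT NOT NULL, -- Replaces the repeated Description string
    Quantity INT NOT NULL,
    INDEX CCI_Sale_CCI_Normalized CLUSTERED COLUMNSTORE);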
When populated with data, the Dimension.Sale_Description table is loaded first with
any new distinct Description values, and then Fact.Sale_CCI_Normalized is loaded with
data that joins the dimension table to retrieve values for Description_Key. Figure 5-5
shows the side-by-side comparison of the storage of these tables.
The table on the left is the original table, whereas the table on the right has
the Description column normalized into a SMALLINT key. In this specific example,
normalization allowed for significant space savings, about a 15:1 reduction from the
original table!
Generally speaking, normalization helps most when the number of distinct values
is low and the length of the column is high, as dictionary encoding for a SMALLINT
column will produce superior compression to dictionary encoding for a large
VARCHAR column.
The downside to normalization is that data load processes will take longer. Joining
additional dimension columns at runtime to load data requires time and resources. The
decision as to whether to normalize or not should be carefully considered and testing
conducted to determine if sufficient gains are realized to justify the change.
Ultimately, business logic and organizational needs will drive this decision at a high
level, whereas technical considerations will allow for further tuning as needed. There is
no right or wrong answer to this question; therefore, consider either approach valid until
hands-on testing proves otherwise.
Vertipaq optimization is skipped in two scenarios:

1. When the tuple mover merges rows from the delta store into a clustered columnstore
index that has at least one nonclustered rowstore index

2. When the columnstore index is on a memory-optimized table

This does not mean that nonclustered rowstore indexes should never be used on
clustered columnstore indexes. Nor should memory-optimized columnstore indexes
be avoided. Instead, the fact that Vertipaq compression will not operate on these
structures should be an additional input into the architectural decision-making
process. If a nonclustered rowstore index is required to service a specific set of
critical queries, then it should be used. If its use is optional, then consider the
negative impact it may have on compression.
While the details of how the algorithm works for a given columnstore index are not
exposed through any convenient views, it is possible to check whether or not a rowgroup
has Vertipaq optimization with the query in Listing 5-4.
Listing 5-4. Query That Returns Whether or Not Vertipaq Optimization Is Used
SELECT
    objects.name,
    partitions.partition_number,
    dm_db_column_store_row_group_physical_stats.row_group_id,
    dm_db_column_store_row_group_physical_stats.has_vertipaq_optimization
FROM sys.dm_db_column_store_row_group_physical_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_physical_stats.object_id
INNER JOIN sys.partitions
ON partitions.object_id = objects.object_id
AND partitions.partition_number = dm_db_column_store_row_group_physical_stats.partition_number
WHERE objects.name = 'Sale_CCI'
ORDER BY dm_db_column_store_row_group_physical_stats.row_group_id;
Note that the encoded values are reordered so that like values are grouped together.
Run-length encoding groups each like value logically to reduce the overall amount of
storage required for the segment. Logically, the resulting structure can be illustrated
as shown in Figure 5-8.
Each value has been given a count, shown in parentheses after it. There is a single
instance of the value 0, followed by two instances of 1, three instances of 2, three
instances of 3, and one instance of 4. This algorithm thrives on repetitive data and
works exceptionally well in conjunction with dictionary encoding and Vertipaq
compression.
Bit array compression is used when a segment contains a small number of distinct
values and the data cannot benefit significantly from Vertipaq compression and run-length
encoding. A bit array is an array in which each distinct value is assigned a column and each
row is mapped to that column using bits. Figure 5-9 illustrates how this compression would
look when used on an unencoded and unsorted data set.
While this may seem like a complicated transformation, the resulting ones and zeros
are quite compact and provide a structure that can be quickly decompressed.
There are a handful of other more complex compression algorithms that may be
used to progressively shrink the size of each segment of data. The algorithms used can
vary by segment and may be used often, sometimes, or not at all. When this process
is complete, the final level of compression is applied by SQL Server, which utilizes its
xVelocity compression algorithm.
Columnstore segments are stored as Large Objects (LOB) in SQL Server. The details
of xVelocity compression are not public, and therefore we cannot delve further into how
they work. While the transformations used to convert the structures discussed thus far
in this chapter into their final form are not fully known, we can infer their effectiveness
by turning on STATISTICS IO and viewing the reads against a columnstore index when a
query is issued. Consider the simple analytic query in Listing 5-5.
SELECT
    SUM(Quantity)
FROM fact.Sale_CCI
WHERE [Invoice Date Key] >= '1/1/2016'
AND [Invoice Date Key] < '2/1/2016';
Clicking the Messages tab provides additional information on the reads and writes
performed by the query, as seen in Figure 5-11.
There is quite a bit of information returned, but for the moment, the LOB logical
reads circled in red provide an indication of the reads required to satisfy this query.
Columnstore indexes will not report reads as logical reads, but instead as LOB logical
reads, which indicates that their segments are stored as Large Objects and not as
traditional SQL Server data structures. If this number of reads seems high, that is
because it is! Later chapters in this book will provide additional optimizations that will
reduce storage and memory footprint, as well as greatly improve query performance.
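To reproduce this measurement, STATISTICS IO can be enabled for the session before rerunning the query from Listing 5-5:

SET STATISTICS IO ON;

SELECT
    SUM(Quantity)
FROM fact.Sale_CCI
WHERE [Invoice Date Key] >= '1/1/2016'
AND [Invoice Date Key] < '2/1/2016';

SET STATISTICS IO OFF;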
Archive compression shrinks a columnstore index further, but requires significantly
more computing resources to read and write the data. As a result, this option should be
reserved for data that is not frequently used, such as cold data kept for historical or
compliance purposes.
69
Chapter 5 Columnstore Compression
Once populated with data, this table can be compared to the previous version that
used standard columnstore compression, as seen in Figure 5-12.
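As a sketch of how archive compression is applied (the index name here is a placeholder), an existing clustered columnstore index can be rebuilt with the COLUMNSTORE_ARCHIVE option:

-- Apply archive compression; revert with DATA_COMPRESSION = COLUMNSTORE
ALTER INDEX CCI_Sale_CCI ON Fact.Sale_CCI
REBUILD WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);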
CHAPTER 6
Columnstore Metadata
Each compressed segment within a columnstore index not only stores analytic data,
but through metadata can describe its contents with more precision than rowstore
tables can.
This metadata resides in system views and allows SQL Server to make intelligent
query processing decisions at runtime. This chapter dives into this metadata in
detail, how SQL Server uses it, and how it can be used to improve the performance of
columnstore indexes.
Rowgroup Metadata
Each compressed rowgroup contains up to 2^20 (1,048,576) rows whose contents are
typically static outside of the influence of index maintenance. Because of this, the row
counts, deleted row counts, size, and other details are accurate and can be used to
understand the size and shape of the underlying data.
The query in Listing 6-1 returns the table and index name, as well as all columns
from the view sys.column_store_row_groups.
Listing 6-1. Query to Return Metadata for All Rowgroups in a Columnstore Index
SELECT
tables.name AS table_name,
indexes.name AS index_name,
column_store_row_groups.*
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
WHERE tables.name = 'Sale_CCI'
ORDER BY tables.object_id, indexes.index_id,
column_store_row_groups.row_group_id;
Partition_number
A rowgroup cannot span multiple partitions; therefore, a partitioned table will have
separate sets of rowgroups in each partition, without overlap between them. This
column provides the partition that a given rowgroup resides on.
Delta_store_hobt_id
For open rowgroups in the delta store, an ID is provided here that links to
sys.internal_partitions and represents the delta rowgroup data it contains. A standard
compressed rowgroup will contain NULL for this column.
Total_rows
This provides the row count for the rowgroup. Summing this value across all rowgroups
allows for an accurate row count to be collected for the table without querying it directly.
If many rowgroups have low row counts, it is indicative of a process problem worthy of
investigation.
Deleted_rows
If any rows are flagged as deleted in the delete bitmap for this rowgroup, the count of
deleted rows is provided here.
Size_in_bytes
The total size of the rowgroup, which is the sum space used by all segments residing
in that rowgroup. This can be used to quickly determine the size of all or part of a
columnstore index.
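For example, a minimal sketch that totals rows and bytes per partition for the Sale_CCI table:

SELECT
    column_store_row_groups.partition_number,
    SUM(column_store_row_groups.total_rows) AS total_rows,
    SUM(column_store_row_groups.size_in_bytes) AS total_size_in_bytes
FROM sys.column_store_row_groups
WHERE column_store_row_groups.object_id = OBJECT_ID('Fact.Sale_CCI')
GROUP BY column_store_row_groups.partition_number
ORDER BY column_store_row_groups.partition_number;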
Segment Metadata
The basic unit of storage in a columnstore index is the segment. Each segment
represents a single column and its contents for the set of rows contained in each
rowgroup. The number of segments that comprise a columnstore index is the product of
the number of rowgroups in the index and columns in the table.
The following query returns segment metadata for the columnstore index on Sale_CCI
from the view sys.column_store_segments:
SELECT
tables.name AS table_name,
indexes.name AS index_name,
columns.name AS column_name,
partitions.partition_number,
column_store_segments.encoding_type,
column_store_segments.row_count,
column_store_segments.has_nulls,
column_store_segments.base_id,
column_store_segments.magnitude,
column_store_segments.min_data_id,
column_store_segments.max_data_id,
column_store_segments.null_value,
column_store_segments.on_disk_size
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.indexes
ON indexes.index_id = partitions.index_id
AND indexes.object_id = partitions.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
AND column_store_segments.column_id = columns.column_id
WHERE tables.name = 'Sale_CCI'
ORDER BY columns.name, column_store_segments.segment_id;
Segment metadata delves into how segments are compressed and stored, which can
provide valuable clues to the efficiency of columnstore compression for each column
within a columnstore index. The following is detail about some of the key columns in
this view.
Encoding_type
This indicates whether this segment is encoded using a dictionary and the type of data
stored in it.
Typically, segments with many distinct values (aka high cardinality) will tend to use
value-based encoding and not implement a dictionary lookup, whereas segments with
low cardinality will more often use a dictionary to reduce the storage of repeated values.
This decision is made internally by SQL Server to improve compression and reduce the
space consumed by a segment.
Row_Count
This row count matches the row count of the corresponding rowgroup and represents
the number of values (repeated or distinct) contained within a columnstore segment.
Has_nulls
If a segment contains at least one NULL, then this is set to 1; otherwise, it is 0. Note that
this column does not indicate if a column allows NULL, but instead reports on whether a
segment happens to have NULL within its set of values. Therefore, different segments for
the same column can have different values for this bit.
Base_id and Magnitude
These columns report directly on value-based encoding. If values were modified using a
base and magnitude to reduce their storage size, that detail is represented here. The base
and magnitude can vary from segment to segment and do not need to be the same for
each segment of any one column.
Min_data_id and Max_data_id
This incredibly useful pair of columns provides the minimum and maximum values for a
segment, or the dictionary lookup values, if a dictionary is used. If transformations have
been made for value-based encoding, then the values provided by these columns will
include those modifications.
The minimum and maximum values for a segment are used by the query optimizer
to skip unnecessary rowgroups when reading a columnstore index. For example, if a
segment contains a minimum value of 1 and a maximum value of 400, then queries
that filter for values outside of this range can skip this rowgroup altogether. This
optimization is known as segment elimination and is key to the performance of large
columnstore indexes. Chapter 10 discusses segment elimination in detail and provides
the conventions needed to take full advantage of this feature.
Null_value
If a segment contains NULLs, then the value provided by this column is the numeric
representation for NULL within the encoding of the column.
On_disk_size
The space consumed by the columnstore segment is provided by the column
on_disk_size, allowing for the space consumed by a columnstore index to be easily
broken down into granular details. If one segment out of many has an unusually large
size, there can be value in exploring why and determining if further optimizations can be
made to improve data compression.
This granular detail also allows the space consumed by each column to be
calculated, as well as space consumed per column per partition. If a columnstore index
is growing unusually quickly, this metadata allows the growth to be isolated to a column
and set of segments to determine the source of that growth. The following query
returns physical statistics for each rowgroup from
sys.dm_db_column_store_row_group_physical_stats:
SELECT
    objects.name AS table_name,
    indexes.name AS index_name,
    dm_db_column_store_row_group_physical_stats.partition_number,
    dm_db_column_store_row_group_physical_stats.row_group_id,
    dm_db_column_store_row_group_physical_stats.state_desc,
    dm_db_column_store_row_group_physical_stats.total_rows,
    dm_db_column_store_row_group_physical_stats.deleted_rows,
    dm_db_column_store_row_group_physical_stats.size_in_bytes,
    dm_db_column_store_row_group_physical_stats.trim_reason_desc,
    dm_db_column_store_row_group_physical_stats.transition_to_compressed_state_desc,
    dm_db_column_store_row_group_physical_stats.has_vertipaq_optimization,
    dm_db_column_store_row_group_physical_stats.created_time
FROM sys.dm_db_column_store_row_group_physical_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_physical_stats.object_id
INNER JOIN sys.indexes
ON indexes.object_id = dm_db_column_store_row_group_physical_stats.object_id
AND indexes.index_id = dm_db_column_store_row_group_physical_stats.index_id
WHERE objects.name = 'Sale_CCI';
The results provide additional information about rowgroups, as seen in Figure 6-3.
Figure 6-3. Rowgroup physical stats for the columnstore index on Sale_CCI
This information provides clues as to how and when rowgroups were built. The
following are descriptions of the key columns presented in Figure 6-3.
State_desc
This is the current state of the rowgroup, which is identical to the value found in
sys.column_store_row_groups. A rowgroup with a state of TOMBSTONE indicates
that it previously was part of a delta store and has been transitioned into compressed
rowgroups, but has yet to be cleaned up. This cleanup occurs asynchronously and
automatically, without the need for operator intervention.
Trim_reason_desc
A rowgroup may contain up to 2^20 (1,048,576) rows. There are many reasons why a
rowgroup is created and compressed with fewer than its full complement of rows. If this
column shows "NO_TRIM" for a rowgroup, then that indicates that it is a full rowgroup
with 1,048,576 rows. If the rowgroup contains fewer than its maximum possible size in
rows, then the trim reason will explain why.
Typically, a columnstore index should mostly contain rowgroups with row counts
near 1,048,576. If that is not the case and a majority of rowgroups are undersized, then
understanding why is key to improving columnstore storage and performance. There is
no firm threshold for defining "undersized," but typically an undersized rowgroup
contains fewer than about 900,000 rows, roughly 10% or more below the maximum size.
Common reasons for a rowgroup to be trimmed include
• AUTO_MERGE: This results when the tuple mover runs and merges
multiple rowgroups together. This is a good thing as it contributes to
larger rowgroups and the removal of undersized ones.
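Other trim reason values include BULKLOAD, REORG, DICTIONARY_SIZE, and MEMORY_LIMITATION. As a sketch, the distribution of trim reasons for a table can be summarized from the same view:

SELECT
    dm_db_column_store_row_group_physical_stats.trim_reason_desc,
    COUNT(*) AS row_group_count,
    AVG(dm_db_column_store_row_group_physical_stats.total_rows) AS avg_rows_per_rowgroup
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE dm_db_column_store_row_group_physical_stats.object_id = OBJECT_ID('Fact.Sale_CCI')
GROUP BY dm_db_column_store_row_group_physical_stats.trim_reason_desc;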
Transition_to_compressed_state_desc
This column describes how a rowgroup was moved from the delta store into a set of
compressed segments within a columnstore index. It provides additional information
for how the rowgroup was created and can be used in conjunction with the trim reason
to understand how all rowgroups were created, whether trimmed or not.
Note that with the exception of the two transition reasons related to index reorganize
operations, the remaining reasons indicate the creation of compressed rowgroups
through automatic processes in SQL Server. These reasons should be seen as part
of normal operations and only investigated if an unsolved performance problem is
encountered.
Has_vertipaq_optimization
This indicates whether the rows within a rowgroup could be reordered via Vertipaq
optimization to improve the compression ratio for this rowgroup.
If zero is indicated by this column, it is worth investigating. The space savings
associated with Vertipaq optimization are nontrivial and result in faster queries
and less memory and storage consumption by a columnstore index. As discussed in
the previous chapter, the two reasons why Vertipaq optimization may not occur are
the presence of nonclustered rowstore indexes on a clustered columnstore index and
the use of memory-optimized columnstore indexes.
Created_time
This indicates the time that a rowgroup was created. This can help in understanding
how many rowgroups are created over time and the number of rows being added to the
columnstore index per unit of time. In addition, created_time can be correlated to
size_in_bytes, allowing the increase in size of the columnstore index to be measured
over time in bytes.
Rowgroup usage can be tracked via sys.dm_db_column_store_row_group_operational_stats,
as seen in the following query:
SELECT
    objects.name AS table_name,
    indexes.name AS index_name,
    dm_db_column_store_row_group_operational_stats.row_group_id,
    dm_db_column_store_row_group_operational_stats.index_scan_count,
    dm_db_column_store_row_group_operational_stats.scan_count,
    dm_db_column_store_row_group_operational_stats.delete_buffer_scan_count,
    dm_db_column_store_row_group_operational_stats.row_group_lock_count,
    dm_db_column_store_row_group_operational_stats.row_group_lock_wait_count,
    dm_db_column_store_row_group_operational_stats.row_group_lock_wait_in_ms,
    dm_db_column_store_row_group_operational_stats.returned_row_count
FROM sys.dm_db_column_store_row_group_operational_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_operational_stats.object_id
INNER JOIN sys.indexes
ON indexes.object_id = dm_db_column_store_row_group_operational_stats.object_id
AND indexes.index_id = dm_db_column_store_row_group_operational_stats.index_id
WHERE objects.name = 'Sale_CCI';
The results in Figure 6-4 show individualized operational data for each rowgroup.
Figure 6-4. Rowgroup operational stats for the columnstore index on Sale_CCI
All metrics in this view are cumulative since the last restart of the SQL Server service.
Therefore, to meaningfully use this data, it must be captured periodically to calculate the
difference in measurements from one sample to the next.
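A minimal sketch of that approach (the snapshot table name is arbitrary): persist timestamped samples and compare consecutive rows per rowgroup.

CREATE TABLE dbo.rowgroup_stats_snapshot
(   sample_time DATETIME2 NOT NULL,
    object_id INT NOT NULL,
    row_group_id INT NOT NULL,
    scan_count BIGINT NOT NULL);

-- Run periodically (for example, via a SQL Server Agent job):
INSERT INTO dbo.rowgroup_stats_snapshot
SELECT
    SYSDATETIME(),
    dm_db_column_store_row_group_operational_stats.object_id,
    dm_db_column_store_row_group_operational_stats.row_group_id,
    dm_db_column_store_row_group_operational_stats.scan_count
FROM sys.dm_db_column_store_row_group_operational_stats;

The difference in scan_count between any two samples for the same rowgroup gives its activity during that interval.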
The following is a subset of columns that can be used to track detailed columnstore
rowgroup usage.
Index_scan_count
This counts the number of times the columnstore index was scanned, regardless of
which rowgroups were requested. This value will be identical for all rowgroups in a given
partition.
Scan_count
This only counts the number of times this rowgroup was scanned. Comparing this to
index_scan_count allows for an understanding of how often a particular rowgroup is
needed to satisfy queries against the columnstore index.
In typical OLAP data, older data is required less often than newer data, and
therefore rowgroups containing older data will see fewer scans than newer rowgroups.
The scan count columns in this view can provide the necessary data to support storage
changes for cold or warm data. If older data is needed for posterity but rarely queried,
it can likely be off-loaded to slower and cheaper storage. Similarly, if there is a
portion of data that is newer and critically important, it may benefit from faster storage.
Delete_buffer_scan_count
This counts the number of times that the delete bitmap was consulted in order to
complete a query against this rowgroup. The delete buffer refers to both the delete
bitmap as it is stored alongside the columnstore index as a b-tree, as well as the in-
memory hash-table representation of that index.
Typically, deleted rows in a columnstore index are not problematic until their
quantity becomes large when compared to the size of the index. If delete_buffer_scan_
count is large and close in value to scan_count, then an index rebuild may be
worthwhile to remove the deleted rows from the index. Check the value for
deleted_rows in dm_db_column_store_row_group_physical_stats first to validate that
the number of deleted rows is indeed large before proceeding with index maintenance.
While an index rebuild can be executed as an online operation, it is still computationally
expensive and can be time-consuming for a large columnstore index.
Row_group_lock_count, Row_group_lock_wait_count,
and Row_group_lock_wait_in_ms
These provide the count (and details) of lock requests against this rowgroup. These are
typically the result of columnstore index writes colliding with reads. Rowgroup locks are
most frequent against rowgroups that are actively being written, whereas locks against
older rowgroups are uncommon.
Rowgroup locking is not inherently a bad thing, but if severe contention is occurring
between data loads and analytic processes, then understanding which rowgroups are
causing contention can assist in troubleshooting that situation.
Update operations will be the most likely to cause contention as they require writing
both to the delete bitmap and to the delta store. While updating a single row would
impact two rowgroups (one for the delete bitmap and one for the delta store), a larger
update might impact many rowgroups. In these scenarios, the locking caused by the
update can be quite disruptive.
Similarly, if data is trickled into a columnstore index throughout the day, there is
more opportunity for contention than if it is managed via a single centralized process
less frequently and during a nonpeak time for analytics.
As is the case for many performance challenges, investigate contention when
needed. If row_group_lock_count appears high, but analytic speeds (and end-user
happiness) are good, then that fact is best saved for posterity, not acted on yet.
Always cross-reference row_group_lock_wait_count and row_group_lock_wait_in_ms to
determine if locking resulted in waits or not. If the count of waits or the wait time
is not high, then further action is likely not needed. If it is unclear whether the
aggregate wait time is high, consider the lock wait time per lock incident, as given
by the query in Listing 6-5.
Listing 6-5. Formula That Calculates the Lock Time per Lock Incidence
SELECT
    objects.name AS table_name,
    indexes.name AS index_name,
    dm_db_column_store_row_group_operational_stats.row_group_id,
    dm_db_column_store_row_group_operational_stats.scan_count,
    dm_db_column_store_row_group_operational_stats.row_group_lock_wait_count,
    dm_db_column_store_row_group_operational_stats.row_group_lock_wait_in_ms,
    CASE
        WHEN dm_db_column_store_row_group_operational_stats.row_group_lock_wait_count = 0 THEN 0
        ELSE CAST(CAST(dm_db_column_store_row_group_operational_stats.row_group_lock_wait_in_ms AS DECIMAL(16,2)) /
             CAST(dm_db_column_store_row_group_operational_stats.row_group_lock_wait_count AS DECIMAL(16,2))
             AS DECIMAL(16,2))
    END AS lock_wait_ms_per_wait_incidence
FROM sys.dm_db_column_store_row_group_operational_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_operational_stats.object_id
INNER JOIN sys.indexes
ON indexes.object_id = dm_db_column_store_row_group_operational_stats.object_id
AND indexes.index_id = dm_db_column_store_row_group_operational_stats.index_id
WHERE objects.name = 'Sale_CCI';
Returned_row_count
One additional metric that can assist in quantifying utilization is the overall
count of rows returned. This provides an additional dimension as to how much data is
read from this rowgroup over a period of time and can be compared to other rowgroups
in the index. Note that a lower row count may be the result of the rowgroup having
fewer rows and not simply less usage.
Memory usage by columnstore objects can be examined via
sys.dm_column_store_object_pool, as seen in the following query:
SELECT
databases.name,
objects.name,
indexes.name,
columns.name,
dm_column_store_object_pool.row_group_id,
dm_column_store_object_pool.object_type_desc,
dm_column_store_object_pool.access_count,
dm_column_store_object_pool.memory_used_in_bytes,
dm_column_store_object_pool.object_load_time
FROM sys.dm_column_store_object_pool
INNER JOIN sys.objects
ON objects.object_id = dm_column_store_object_pool.object_id
INNER JOIN sys.indexes
ON indexes.object_id = dm_column_store_object_pool.object_id
AND indexes.index_id = dm_column_store_object_pool.index_id
INNER JOIN sys.databases
ON databases.database_id = dm_column_store_object_pool.database_id
LEFT JOIN sys.columns
ON columns.column_id = dm_column_store_object_pool.column_id
AND columns.object_id = dm_column_store_object_pool.object_id
WHERE objects.name = 'Sale_CCI'
AND databases.name = DB_NAME()
ORDER BY dm_column_store_object_pool.row_group_id, columns.name;
Object_type_desc
This is the type of columnstore object residing in memory. It will be one of the
following values:
• COLUMN_SEGMENT: Each row with this type represents a single
columnstore segment. These entries will contain a value for column_
id, allowing for further detail to be returned regarding the column
that the segment belongs to.
Access_count
This is a count of all read and write operations to this object in memory. This provides a
rough measure of how much use the object has gotten and which objects in memory are
used often vs. those that are used rarely.
Because this view examines objects residing in memory, the data is transient and
maintains history only for as long as an object is in memory. Therefore, access counts are
only cumulative since the time given by object_load_time.
Object_load_time
This is the time that this object was read into the object pool in memory. Combined with
access_count, the number of read or write operations per unit time can be calculated,
which may be useful in gauging how much usage particular objects are getting
in memory.
The data provided in dm_column_store_object_pool can also be aggregated so that
overall memory consumption by a given rowgroup, column, index, or object type can
be measured. For example, the query in Listing 6-7 quantifies how much memory is
consumed by each rowgroup.
SELECT
    databases.name,
    objects.name,
    indexes.name,
    dm_column_store_object_pool.row_group_id,
    SUM(dm_column_store_object_pool.access_count) AS access_count,
    SUM(dm_column_store_object_pool.memory_used_in_bytes) AS memory_used_in_bytes
FROM sys.dm_column_store_object_pool
INNER JOIN sys.objects
ON objects.object_id = dm_column_store_object_pool.object_id
INNER JOIN sys.indexes
ON indexes.object_id = dm_column_store_object_pool.object_id
AND indexes.index_id = dm_column_store_object_pool.index_id
INNER JOIN sys.databases
ON databases.database_id = dm_column_store_object_pool.database_id
LEFT JOIN sys.columns
ON columns.column_id = dm_column_store_object_pool.column_id
AND columns.object_id = dm_column_store_object_pool.object_id
WHERE objects.name = 'Sale_CCI'
AND databases.name = DB_NAME()
GROUP BY databases.name, objects.name, indexes.name,
dm_column_store_object_pool.row_group_id
ORDER BY dm_column_store_object_pool.row_group_id;
Depending on the index, its metadata, and its usage, some rowgroups may use
significantly more memory than others. In the example results in Figure 6-6, memory
is used relatively equally per rowgroup, indicating that queries are reading data evenly
from each rowgroup.
The delta store and delete bitmap can be examined alongside rowgroup metadata via
sys.internal_partitions, as seen in the following query:
SELECT
tables.name AS table_name,
indexes.name AS index_name,
partitions.partition_number,
column_store_row_groups.row_group_id,
column_store_row_groups.state_description,
column_store_row_groups.total_rows,
column_store_row_groups.size_in_bytes,
column_store_row_groups.deleted_rows,
internal_partitions.internal_object_type_desc,
internal_partitions.rows,
internal_partitions.data_compression_desc
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.partitions
ON partitions.partition_number = column_store_row_groups.partition_number
AND partitions.index_id = indexes.index_id
AND partitions.object_id = tables.object_id
LEFT JOIN sys.internal_partitions
ON internal_partitions.object_id = tables.object_id
WHERE tables.name = 'Sale_CCI'
ORDER BY indexes.index_id, column_store_row_groups.row_group_id;
The results of this query provide at least one row per rowgroup per partition,
depending on the contents. Figure 6-7 contains a sample of this output.
The internal columnstore objects shown illustrate the presence of both deleted rows
and delta store contents. The deleted_rows column contains an individual count for each
rowgroup, whereas the rows column contains aggregate details.
The following are short descriptions of each new column in the result set and how it
can be used to track the growth and usage of a columnstore index.
Internal_object_type_desc
This indicates what type of internal object is referenced, such as the delta store or
the deleted rows bitmap.
CHAPTER 7
Batch Execution
Processing of rows in SQL Server is traditionally managed one row at a time. For
transactional workloads, this is a sensible convention, as row counts for read and write
operations are typically small.
Analytic queries that routinely operate on thousands or millions of rows do not
benefit from reading rows in this fashion. Batch mode execution is a SQL Server
execution mode that allows groups of rows to be read and passed between execution
plan operators, ultimately improving performance.
As a query is executed and each operator is processed, rows are passed between
those operators in quantities determined by the execution mode. In row mode, each
row is passed from operator to operator sequentially. In batch mode, groups of rows are
passed between operators. The result of this convention is that in batch mode, control is
passed between operators less often as rows can be handed off in fewer batches.
Consider a simple query that returns a single row from a rowstore dimension table:

SELECT
    Employee,
    [WWI Employee ID],
    [Preferred Name],
    [Is Salesperson]
FROM Dimension.Employee
WHERE [Employee Key] = 17;
The result of this query is a single row, as it seeks a single ID value by the primary key
in the table. Turning on the actual execution plan, the details for the clustered index seek
can be viewed, as seen in Figure 7-1.
Note the execution modes provided in the query operator details. Row mode is
indicated for both the estimated and actual execution modes. Given that the underlying
table is a rowstore table and that the query only returns a single row, execution via row
mode is expected for this operation.
While columnstore indexes are optimized for analytic queries that operate against
large row counts and will usually take advantage of batch mode processing, row mode can
be chosen as the execution mode if the optimizer believes that is the most efficient option.
The query in Listing 7-2 shows a narrow query executed against a columnstore
indexed table that also happens to have a nonclustered rowstore primary key.
Listing 7-2. Query That Uses Row Mode Execution Against a Columnstore Index
SELECT
*
FROM fact.Sale
WHERE [Invoice Date Key] = '1/1/2016'
AND [Sale Key] = 198840;
This query returns a single row against a clustered columnstore index in which row
mode was used, as shown in Figure 7-2.
Figure 7-2. Execution plan with row mode execution against a columnstore index
While the execution plan operator indicates it is using columnstore storage, the
optimizer chooses row mode as the execution mode. This should not be seen as
unusual or suboptimal. Because the query only returns a single row, the use of row
mode is optimized for that expected outcome, even if the table is stored in a clustered
columnstore index.
Next, consider an aggregate query that summarizes metrics from the columnstore
indexed table.
This query returns only a single row, but processes many rows to crunch these
metrics. The execution plan in Figure 7-3 shows the details for this query.
Figure 7-3. Execution plan with batch mode execution on a columnstore index
Batch is the expected execution mode and is what SQL Server chose for this query.
Note that the plan details indicate that the “Actual Number of Locally Aggregated Rows”
is 29,458. This is the number of rows required by SQL Server to satisfy the query, which is
confirmed in the results shown in Figure 7-4.
Whether batch mode is chosen by the query optimizer for a given plan operator
depends on the number of rows processed by that operator and not on the number of
rows returned by a query.
Execution modes are not all-or-nothing choices for a query. The query optimizer can
choose batch mode for some operators and row mode for others and will do so based on
whatever mode it determines is most efficient for each.
A common analytic query pattern is to aggregate data from a large columnstore
index and join into dimension tables to provide lookup values where needed. The query
in Listing 7-4 illustrates a classic scenario in which both dimension and fact tables are
queried together.
Listing 7-4. Query That Joins a Large Analytic Table with a Small Lookup Table
SELECT
City.City,
City.[State Province],
City.Country,
COUNT(*)
FROM Fact.Sale
INNER JOIN Dimension.City
ON City.[City Key] = Sale.[City Key]
WHERE [Invoice Date Key] >= '1/1/2016'
AND [Invoice Date Key] < '2/1/2016'
GROUP BY City.City, City.[State Province], City.Country;
Figure 7-5. Execution plan for a query against fact and dimension tables
Figure 7-6. Operator properties for columnstore and rowstore index scans
The columnstore index scan operates in batch mode, whereas the rowstore index
scan operates in row mode. This query was executed against a database with SQL Server
2016 compatibility level (130). To test the effects of batch mode on rowstore usage,
the compatibility level is adjusted to 150 (SQL Server 2019), as shown by the query in
Listing 7-5.
Listing 7-5. Query to Alter the Database Compatibility Level to SQL Server 2019
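ALTER DATABASE CURRENT -- Assumes the demonstration database is the current database context
SET COMPATIBILITY_LEVEL = 150;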
After this change, the query in Listing 7-4 is executed again, with the execution plan
shown in Figure 7-7.
When the compatibility level is switched to 150, multiple new features in SQL Server
2019 are enabled. The execution plan shows the use of adaptive joins, where the join to
Dimension.City is given the option of either a clustered index scan or a clustered index
seek. The actual row counts for each operator confirm that SQL Server chose to use the
clustered index scan, due to the large number of rows returned. Figure 7-8 shows the
operator properties for the clustered index scan against Dimension.City.
Figure 7-8. Operator properties for the rowstore clustered index scan
With SQL Server 2019 features available, batch mode is chosen as the execution
mode for the rowstore clustered index scan. For a query scanning many rows, this makes
perfect sense.
Starting in SQL Server 2019, batch mode can be used by SQL Server against rowstore
tables, but will be limited to scenarios when row counts are higher and using batch
mode is expected to improve performance. Prior to SQL Server 2019, batch mode was
only available for columnstore indexes. This is a significant improvement as it can greatly
improve analytic query performance against rowstore tables.
This leads to an immediate question: If batch mode on rowstore tables can improve
analytic performance, are columnstore indexes still useful? If batch mode were the
only feature that made columnstore indexes highly performant, then that would be
a valid question. Batch mode on rowstore tables does not provide the other benefits
of columnstore indexes, such as high compression ratios and segment elimination, so
columnstore indexes remain the better fit for large analytic tables.
Figure 7-9. Sample row mode execution plan with row counts
In the execution plan, a clustered index scan is retrieving 276 rows that are passed
into an aggregation that results in a single row that is in turn returned by the query. In
row mode, each of the 276 rows is individually transferred from the clustered index scan
operator to the stream aggregate operator.
When used, batch mode processes multiple rows together, storing the resulting data
structure as a set of vectors in memory. This results in a completely new way of data
being passed from operator to operator in an execution plan. Each batch can consume
up to 64 kilobytes and contain between 64 and 900 rows. The row batch can be visualized
in Figure 7-10.
A row batch looks somewhat similar to the structure of a columnstore index and
provides similar benefits in memory as a query is executed. The qualifying rows vector
performs a similar function to the columnstore index’s delete bitmap in that it flags rows
that are no longer logically needed by the batch. For example, if a filter is applied to a
data set, it can be processed solely by updating the qualifying rows vector for the rows
that are to be filtered out.
Batch mode processing reduces CPU consumption by decreasing the number of
times that data is transferred between execution plan operators. For example, 5000 rows
processed via row mode processing will require at least 5000 CPU operations to move it
between plan operators. The same data set processed via batch mode might assign 500
rows each to ten batches. In this scenario, data can be passed between execution plan
operators using 5000/500 = 10 operations. These are approximations, but they illustrate
how batch mode and row mode processing affect CPU consumption.
Another boon to batch mode processing is the ease with which it can take advantage of
parallelism. Batches can be processed in parallel, allowing a multicore CPU architecture
with available capacity to process query plan operators more efficiently. Parallelism
requires some CPU overhead; therefore, the number of rows involved needs to be
significant enough that the benefits of breaking a workload into smaller chunks
outweigh the cost of doing so. Row mode does not benefit as easily from parallelism
because the effort to split single-row operations into separate parallel operations and
then combine them back together is far greater than the trivial savings afforded by
that process.
While batch mode may not always be chosen for rowstore tables, it will be the default
choice for columnstore indexes. Since a rowgroup can contain up to 2^20 rows, the
volume of data that needs to be processed by a columnstore index scan will usually be
large enough that SQL Server chooses batch mode as the likely candidate to provide
the best performance for queries against that data. The identical queries below
illustrate the difference when one execution happens to process in row mode and the
other in batch mode:
SELECT
Sale.[City Key],
COUNT(*)
FROM fact.Sale
GROUP BY Sale.[City Key]
SELECT
Sale.[City Key],
COUNT(*)
FROM fact.Sale
GROUP BY Sale.[City Key]
The execution plans that result from these sample queries provide an immediate
clue that a significant performance difference occurred, as seen in Figure 7-11.
The two execution plans appear nearly identical, except that the query cost for the
first is 90%, whereas the second is 10%. In addition, aggregate pushdown occurred,
allowing the GROUP BY in the query to be processed inline with the columnstore index
scan. This prevented the need to push all 228,265 rows one at a time onto the hash
match operator. Figure 7-12 compares the details for each columnstore index scan.
Figure 7-12. Execution plan details for row mode vs. batch mode operation
There are a handful of notable differences between the expensive execution plan on
the left and the efficient execution plan on the right:
• Batch mode was used successfully to process 228,265 rows in the second plan.
• CPU consumption using batch mode is nearly ten times less than
using row mode.
• Actual Number of Locally Aggregated Rows documents aggregate
pushdown.
Making use of batch mode allows SQL Server to also take advantage of aggregate
pushdown. The combination of these features was what allowed for CPU to be so greatly
reduced.
This raises an important aspect of batch mode processing that further increases its
effectiveness on analytic workloads: batch mode enables other powerful performance-
improving features to be used when a query is optimized. Some of these features (along
with the earliest SQL Server versions in which they were enabled) include
• Adaptive joins (SQL Server 2017)
• Memory grant feedback (SQL Server 2017)
• Aggregate pushdown (SQL Server 2016)
Generally speaking, when SQL Server chooses to use batch mode, performance
will be improved with its usage. Similarly, when additional intelligent query processing
features are used (such as adaptive joins or aggregate pushdown), they will also have a
positive impact on performance.
Testing batch mode vs. row mode processing can be challenging as forcing a query to
use one over the other is not always straightforward. One way to make this testing easier
is to use database scoped configuration changes to temporarily disable these features.
The T-SQL in Listing 7-7 provides the syntax needed to disable batch mode on rowstore,
as well as batch mode memory grant feedback and batch mode adaptive joins.
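A sketch of that syntax, using the documented database scoped configuration options:

ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_MEMORY_GRANT_FEEDBACK = OFF;
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ADAPTIVE_JOINS = OFF;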
Note that these features should be disabled for testing purposes only and should not
be turned off in a production environment unless there are exceptional and
well-documented reasons to do so.
Compatibility levels may also be adjusted for testing and demonstration purposes.
This can help model the impact of an upgrade or allow a SQL Server upgrade to
functionally occur at a slower pace over time. By incrementally adjusting the
compatibility level up one level at a time from the original SQL Server version to the
new version, risk can be isolated and mitigated in steps, rather than all at once. This
process also provides a rollback mechanism, since compatibility levels can be lowered
in the event of an emergency.
CHAPTER 8
Bulk Loading Data
Analytic workloads differ greatly in their recovery needs as data loads tend to
be larger, less frequent, and asynchronous. When analytic data loads fail, the most
common recourse is to investigate the problem, resolve it, and then rerun the data load.
Point-in-time recovery within data load processes is less important than simply having
recovery available to a point in time prior to the data load. Therefore, OLAP data loads
can benefit greatly from minimally logged insert operations.
Outside of columnstore indexes, bulk loading data is limited to a handful of write
operations, such as
• BCP
• Partition switching
Unlike some of the other types of minimally logged insert operations in SQL Server,
there are no prerequisites to take advantage of bulk loading data into a columnstore index.
There is no need to adjust isolation levels, use explicit table locking, or adjust parallelism
settings. Since bulk loading data will be the desired operation for large inserts into
columnstore indexes, it is the default and will be used automatically when possible.
If an insert operation contains more than 2^20 (1,048,576) rows, it will be subdivided
into inserts in the following fashion:
1. Each batch of 2^20 rows will be bulk inserted into the columnstore
index until there are fewer than 2^20 rows remaining to insert.
2. If the remainder of rows is greater than or equal to 102,400, it will
also be bulk inserted into the columnstore index.
3. If the remainder is less than 102,400 rows, those rows will be written
to the delta store instead.
SQL Server can bulk load data into a columnstore index in multiple parallel threads
so long as the data for each thread is targeting a different data file. This is automatic and
requires no particular user action to accomplish. Columnstore bulk insert operations
acquire exclusive locks against target rowgroups, and so long as parallel inserts target
separate data files, they are guaranteed to not overlap the rowgroups they are inserting into.
When data is inserted into a partitioned columnstore index, that data is first assigned
a partition and then each group of rows is inserted into their respective partitions.
Therefore, whether bulk load processes are used will be dependent on the numbers of
rows inserted into the target partition, rather than the total number of rows inserted
by the source query. Typically, analytic tables will have a current/active partition that
accepts new data, whereas the remaining partitions will contain older data that is no
longer written to (outside of one-off data loads, software releases, or maintenance
events).
The insert takes about 1 second. With the rows inserted, an undocumented but
useful system function will be used to read the contents of the transaction log and
determine the size of the transaction, as seen in Listing 8-2.
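The function in question is sys.fn_dblog(). A sketch of such a query, following the same pattern as Listing 8-5 later in this chapter; the allocation unit name here is hypothetical and would match the demonstration table's clustered index:

SELECT
    SUM(fn_dblog.[log record length]) AS log_size
FROM sys.fn_dblog (NULL, NULL)
-- Hypothetical allocation unit name for the rowstore demonstration table:
WHERE fn_dblog.allocunitname = 'Fact.Sale_CCI_Clean_Test.PK_Sale_CCI_Clean_Test';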
The result is a single row that indicates the total log size for the insert transaction, as
seen in Figure 8-2.
Figure 8-2. Transaction size for an insert of 102,400 rows into a clustered
rowstore index
The log size returned for the rowstore insert was 225,856 bytes, or about 220KB. The
same insert will now be performed against a clean copy of the table with a columnstore
index, as seen in Listing 8-3.
This insert takes less than a second to complete. A new clean table is used to ensure
that there are no residual transactions from previous demonstrations that would pollute
the log. The results from fn_dblog() are seen in Figure 8-3.
Figure 8-3. Transaction size for an insert of 102,400 rows into a clustered
columnstore index
Note that the transaction log size for the insert is 118,192 bytes, or about 115KB. This
constitutes a significant reduction in transaction size when compared to a page
compressed rowstore index.
With the impact that bulk loading can have on transaction size demonstrated, it is
important to illustrate the difference between inserting 102,400 rows into a columnstore
index and inserting 102,399 rows. The T-SQL in Listing 8-4 inserts 102,399 rows into a
newly created clustered columnstore index.
This takes about a second to execute. The query to pull data from fn_dblog() needs to
be adjusted slightly to accommodate the log growth due to both the columnstore index
itself and the delta store. This is shown in Listing 8-5.
Listing 8-5. Query to Calculate Log Growth for Columnstore Index and
Delta Store
SELECT
fn_dblog.allocunitname,
SUM(fn_dblog.[log record length]) AS log_size
FROM sys.fn_dblog (NULL, NULL)
WHERE fn_dblog.allocunitname IN (
'Fact.Sale_CCI_Clean_Test_2.CCI_Sale_CCI_Clean_Test_2',
'Fact.Sale_CCI_Clean_Test_2.CCI_Sale_CCI_Clean_Test_2(Delta)')
GROUP BY fn_dblog.allocunitname;
Note that the delta store needs to be referenced separately to be included in the
results. The transaction log space consumed by each object is seen in Figure 8-4.
Figure 8-4. Transaction size for an insert of 102,399 rows into a clustered
columnstore index
The insert into the delta store was significantly more expensive than any operation
demonstrated thus far in this chapter, with a transaction size of 1,300,448 bytes, or
about 1.2MB. As a result, there is a significant incentive to take advantage of bulk
loading data into columnstore indexes when possible.
Batching rows so that inserts reach the bulk load threshold can save enough resources
to be meaningful. Actively avoiding repeated small-batch inserts will help ensure that
insert performance remains good. If the delta store is needed for the tail end of a data
load process, then there is no reason to avoid it.
As a planned maintenance event, these are relatively innocuous tasks that should
not consume an unusual amount of time, even on a larger table.
Listing 8-6. Query to Return Details About the Structure of a Columnstore Index
SELECT
tables.name AS table_name,
indexes.name AS index_name,
partitions.partition_number,
column_store_row_groups.row_group_id,
column_store_row_groups.state_description,
column_store_row_groups.total_rows,
column_store_row_groups.size_in_bytes,
column_store_row_groups.deleted_rows,
internal_partitions.internal_object_type_desc,
internal_partitions.rows,
internal_partitions.data_compression_desc
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.partitions
ON partitions.partition_number = column_store_row_groups.partition_number
AND partitions.index_id = indexes.index_id
AND partitions.object_id = tables.object_id
LEFT JOIN sys.internal_partitions
ON internal_partitions.object_id = tables.object_id
-- The filter below is assumed; it matches the demonstration table from Listing 8-5:
WHERE tables.name = 'Sale_CCI_Clean_Test_2'
ORDER BY indexes.index_id, column_store_row_groups.row_group_id;
Note that the entire contents of the columnstore index (102,399 rows) reside in the
delta store. The delete bitmap exists as a default and is currently empty. If the operator
wishes to move the contents of the delta store into columnstore rowgroups, this can be
accomplished by the index maintenance command in Listing 8-7.
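A minimal form of that command, using the table and index names from Listing 8-5, forces delta rowgroups to be compressed:

ALTER INDEX CCI_Sale_CCI_Clean_Test_2 ON Fact.Sale_CCI_Clean_Test_2
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);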
Once complete, the delta store would be compressed and ready to move into the
columnstore index, as seen in Figure 8-6.
The results show that the delta rowgroup affected by the index maintenance is
now in an intermediary state. A new object has been created and compressed with
the contents of the delta rowgroup inserted into it. The previous delta rowgroup (an
uncompressed heap) is left in a tombstone state for the tuple mover to remove at a
later point in time. Note the significant size difference between an uncompressed delta
rowgroup and a compressed rowgroup.
At this time, running the same ALTER INDEX statement as before will force the
tuple mover to complete this cleanup. Alternatively, waiting for a short amount of time
to pass will achieve the same results. After 5 minutes have passed, the contents of this
columnstore index are as seen in Figure 8-7.
Figure 8-7. Columnstore index contents after the tuple mover executes
Once the tuple mover completes its cleanup process, all that remains is a single
compressed columnstore rowgroup.
Performing maintenance like this is not necessary but can improve query read
speeds against a columnstore index after a data load is completed. Index maintenance
will be discussed in far more detail in Chapter 14, including its use as part of data loads
and other columnstore processes.
Summary
When loading data into a columnstore index, bulk loading ensures the fastest possible
load speeds while minimizing server resource consumption. Bulk loading large numbers
of rows in fewer batches is far more efficient than using trickle or small batch inserts.
By focusing data load processing around maximizing the use of compressed
columnstore rowgroups and minimizing delta store usage, overall columnstore index
performance can be improved, both for the data load processes and analytics that use
the newly loaded data.
CHAPTER 9
Delete and Update Operations
The sample data is encoded with dictionary compression, reordered with VertiPaq
optimization, and further compressed with run-length encoding. If a process deletes the
last two rows in the table, then it is necessary to decompress the data fully, delete the
rows, and then recompress it. Figure 9-2 shows how this process would impact the data.
The resulting set of indexed ID groups contains only 4 entries, rather than 5 (as index
ID 4 has been removed), and the count for index ID 3 has been reduced.
If the table were larger and the deletion impacted more rows, then it is likely that
many rowgroups would need to be decompressed, adjusted, and recompressed. In the
process, many pages would need to be updated. This operation would quickly grow to be
prohibitively expensive. A balance needs to be maintained between the speed of write
operations and the speed of read operations, and in this scenario, the ability to load and
modify data quickly needs to be prioritized.
Delete Operations
In a columnstore index, the cost to decompress rowgroups, delete rows, and recompress
them is prohibitively high. The more rowgroups a delete operation targets, the greater
this cost would become. To mitigate this cost, a structure called the delete bitmap is used
to track deletions from the columnstore index.
The delete bitmap is a heap that references the rows within underlying rowgroups.
When a row is deleted, the data within the rowgroup remains unchanged and the
delete bitmap is updated with a reference to the deleted row. As such, deleted rows in
columnstore indexes can be seen as soft deleted, where removed rows are flagged, but
not physically deleted.
Note that delete operations against rows in the delta store do not need to use the
delete bitmap as the delta store is a rowstore heap and rows can simply be deleted from
it as needed, without the need for soft deletes.
When a query is executed against a rowgroup containing deleted rows, the delete
bitmap is consulted and deleted rows are excluded from the results. The delete bitmap
may be visualized as seen in Figure 9-3.
Figure 9-3. The delete bitmap and its relationship to a columnstore index
The underlying rows within the rowgroup are unchanged when a deletion occurs
against them. Instead, the delete bitmap tracks which rows are deleted and is consulted
when future queries are issued against this rowgroup. Because of this, deleting data in a
columnstore index will not reclaim storage space.
DELETE
FROM Fact.Sale_CCI
WHERE [Invoice Date Key] = '1/1/2016';
When executed, this query will delete all sale data that was invoiced on 1/1/2016.
Before doing so, the rowgroup metadata can be consulted to confirm that there
are currently no deleted rows in the rowgroup. The query in Listing 9-2 will return
metadata about rowgroups in this columnstore index, including the number of deleted
rows, if any.
Listing 9-2. Query That Returns Metadata About Deleted Rows in Rowgroups
SELECT
tables.name AS table_name,
indexes.name AS index_name,
partitions.partition_number,
column_store_row_groups.state_description,
column_store_row_groups.total_rows,
column_store_row_groups.size_in_bytes,
column_store_row_groups.deleted_rows
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.partitions
ON partitions.partition_number = column_store_row_groups.partition_number
AND partitions.index_id = indexes.index_id
AND partitions.object_id = tables.object_id
WHERE tables.name = 'Sale_CCI'
ORDER BY indexes.index_id, column_store_row_groups.row_group_id;
Figure 9-4. Rowgroup metadata for a columnstore index, including deleted rows
Each rowgroup now contains some number of deleted rows, depending on the
number of rows within them that happen to have been invoiced on 1/1/2016. Note that
the total rows in each rowgroup and the size in bytes have not changed. This is expected
and reflects the fact that rows were soft deleted via the delete bitmap.
Because rows were removed from all rowgroups in the index, a deletion without the
aid of the delete bitmap would have required a rebuild of the entire index, which would
be a prohibitively expensive operation to perform while a process waits for a single day’s
worth of rows to be deleted. Instead, the delete of 15,400 rows completed exceptionally
quickly, in under a second, thanks to the delete bitmap!
The only way to clean up deleted rows and free up the space consumed within
rowgroups is to perform an index rebuild on the columnstore index or on any partitions
impacted by the deletion. This is generally not needed unless the amount of deleted data
becomes large with respect to the overall size of the index. Like with classic rowstore
indexes, fragmentation is not problematic until there is too much of it, at which time a
rebuild can resolve it. Note that index reorganize operations do not remove deleted rows
from columnstore indexes! Chapter 14 dives into index maintenance in detail, discussing
how and when it is needed, and best practices for its use.
Update Operations
In a columnstore index, an update is executed as two operations: a delete and an insert.
While the update logically performs as a single atomic unit, under the covers it will
consist of a DELETE of the existing row, tracked via the delete bitmap, followed by an
INSERT of the new row version into the delta store.
This means that an update operation will need to write to both the delete bitmap
and the delta store in order to complete successfully. It is also important to note that the
insert operations that result from an update operation against a columnstore index will
exclusively use the delta store and cannot take advantage of bulk load processes.
Before continuing, a rebuild will be executed against the columnstore index to allow
for easier visualization of the results. The query in Listing 9-3 will rebuild the index.
Listing 9-3. Query to Rebuild a Columnstore Index, Removing the Delete Bitmap
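A minimal form of such a rebuild, assuming the index name CCI_Sale_CCI follows the naming convention used elsewhere in these examples:

ALTER INDEX CCI_Sale_CCI ON Fact.Sale_CCI REBUILD;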
With the data now clean, the query in Listing 9-4 can be used to return metadata
about all rowgroups in the columnstore index, including both the delta store and
delete bitmap.
Listing 9-4. Query to Return Delete Bitmap and Delta Store Metadata for
Rowgroups in a Columnstore Index
SELECT
tables.name AS table_name,
indexes.name AS index_name,
partitions.partition_number,
column_store_row_groups.row_group_id,
column_store_row_groups.state_description,
column_store_row_groups.total_rows,
column_store_row_groups.size_in_bytes,
column_store_row_groups.deleted_rows,
internal_partitions.internal_object_type_desc,
internal_partitions.rows
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.partitions
ON partitions.partition_number = column_store_row_groups.partition_number
AND partitions.index_id = indexes.index_id
AND partitions.object_id = tables.object_id
LEFT JOIN sys.internal_partitions
ON internal_partitions.object_id = tables.object_id
WHERE tables.name = 'Sale_CCI'
ORDER BY indexes.index_id, column_store_row_groups.row_group_id;
The results in Figure 9-6 show a clean columnstore index with no deleted rows or
entries in the delta store.
With a pristine columnstore index available, the effects of update operations can be
easily visualized against it. Consider the query shown in Listing 9-5.
SELECT
*
FROM Fact.Sale_CCI
WHERE [Invoice Date Key] = '1/2/2016';
This SELECT query identifies a total of 18,150 rows in the table that match the filter
criteria of an invoice date of 1/2/2016. Next, an update will be made against two columns
in the table, as shown in Listing 9-6.
UPDATE Sale_CCI
SET [Total Dry Items] = [Total Dry Items] - 1,
[Total Chiller Items] = [Total Chiller Items] + 1
FROM Fact.Sale_CCI
WHERE [Invoice Date Key] = '1/2/2016';
Returning to the rowgroup metadata query in Listing 9-4, the results of the UPDATE
statement can be reviewed, with a sample seen in Figure 9-7.
The metadata after the UPDATE was executed shows that every rowgroup in the
columnstore index was impacted as 18,150 rows were deleted and 18,150 rows were
then inserted. Sys.internal_partitions shows a delete bitmap and delta store object,
each containing 18,150 total rows. The rowgroup detail illustrates how many rows were
updated per rowgroup. In addition, the new rowgroup (number 25) shows the new open
delta store that was created for the newly inserted rows.
The resulting structure underscores the fact that an UPDATE against a columnstore
index executes as a combination of discrete delete and insert operations. Performing
each of those operations sequentially would yield similar results.
Consider an update to all rows in this columnstore index for the range of invoice
dates from 1/3/2016 up to 1/8/2016. A count of these rows shows a total of 148,170
that match that date filter. Before running an update, the columnstore index will be
rebuilt, which will clean up the deleted rows and delta stores, allowing for a better
demonstration. Listing 9-7 provides the query to rebuild the index and then update
these rows.
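The rebuild step mirrors Listing 9-3 (index name assumed, as before) and precedes the update shown next:

ALTER INDEX CCI_Sale_CCI ON Fact.Sale_CCI REBUILD;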
UPDATE Sale_CCI
SET [Total Dry Items] = [Total Dry Items] - 1,
[Total Chiller Items] = [Total Chiller Items] + 1
FROM Fact.Sale_CCI
WHERE [Invoice Date Key] >= '1/3/2016'
AND [Invoice Date Key] < '1/8/2016';
When updating 148,170 rows, the first thing to note is that it took 5 full seconds to
execute! The previous update of 18,150 rows completed almost instantly after being
executed. Viewing the metadata reveals the reason for this, as seen in Figure 9-8.
The key takeaway from the metadata following the larger update is that the delta
store contains all 148,170 rows updated in the operation. Normally, any INSERT
operations of 102,400 rows or more will benefit from a minimally logged bulk insert
process, but UPDATE operations cannot benefit from it. Because the UPDATE is
composed of both an INSERT and DELETE in the same transaction, it is not possible to
fork the single transaction into a fully logged DELETE and a minimally logged INSERT.
Bulk insert processes on columnstore indexes can save immense resources when
used, but are not allowed to violate the ACID (Atomic, Consistent, Isolated, and Durable)
properties of a SQL Server database. Attempting to splice together a fully logged and
minimally logged transaction into a single larger transaction would require the creation
of a transaction that would provide point-in-time restore capabilities for the DELETE,
but not the INSERT. While it is possible to conceive of ways for SQL Server to architect its
way around that limitation, doing so would be confusing to anyone using a columnstore
index and could result in unexpected results when restoring databases containing
columnstore indexes that are subject to frequent UPDATE operations.
The key takeaway is that an UPDATE against a columnstore index will be comprised
of a DELETE operation and a fully logged INSERT into the delta store. As the number of
rows updated increases, the performance of those operations will decrease dramatically.
The delta store was built to handle small numbers of rows – either trickle loads or the
residual rows from a larger data load process. It was not built for large volumes of rows at
one time and as a matter of course will perform poorly for those applications.
When executed, the result is 1,159,180 rows counted. This is greater than the
1,048,576 rows that can be stored in a single columnstore rowgroup. Listing 9-9 rebuilds
the index once more and then performs an update against all 1,159,180 rows identified
in Listing 9-8. This single UPDATE operation will result in the deletion of old rows, the
insertion of new rows into the delta store, and the compression of most of the new rows
into rowgroups.
UPDATE Sale_CCI
SET [Total Dry Items] = [Total Dry Items] - 1,
[Total Chiller Items] = [Total Chiller Items] + 1
FROM Fact.Sale_CCI
WHERE [Invoice Date Key] >= '1/8/2016'
AND [Invoice Date Key] < '3/5/2016';
The update of 1,159,180 rows took almost a minute to complete. Figure 9-9 shows
the resulting columnstore metadata immediately after this large UPDATE operation
completes.
Note that there are multiple delta stores present. The open delta store (number 26)
will remain open to accept future inserted data. The closed delta store will be processed
by the tuple mover asynchronously and be compressed into a permanent columnstore
rowgroup. Figure 9-10 shows the metadata for this columnstore index after a minute has
passed and the tuple mover has executed.
Figure 9-10. Metadata for rowgroups after updating 1,159,180 rows and allowing
the tuple mover to execute
The sequence of events was as follows:
1. All 1,159,180 old rows were marked as deleted in the delete bitmap.
2. 1,048,576 rows were inserted into a delta store, filling and closing it.
3. 110,604 rows were inserted into another delta store and remain in
an open state awaiting future inserted rows.
4. The 1,048,576 rows in the full delta store are processed by the
tuple mover and compressed into a permanent rowgroup.
This is a significant amount of work for a single UPDATE statement, and it scales
poorly with increasing row counts.
In general, updates should be limited to small row counts or avoided altogether.
There is value in rethinking how code is written to refactor an UPDATE against
a columnstore index into a set of deletes and inserts or to manage an update via
intermediary processes that avoid it altogether. The performance of UPDATE operations
on columnstore indexes will be unpredictable and a large enough row count will result
in transactions large enough to create resource pressure on the SQL Server.
Chapter 15 (best practices) will discuss in greater detail how to avoid updates and a
variety of tactics that can be used when migrating UPDATE code from rowstore indexes
into columnstore indexes.
CHAPTER 10
Segment and Rowgroup Elimination
Segment Elimination
Each segment represents data for a single column over a set of rows. When a query is
executed against a columnstore index, SQL Server needs to determine which rowgroups
are required to return the requested result set. Within those rowgroups, only the
segments containing data for the selected columns will be read. Therefore, queries that
select fewer columns will read fewer segments, thereby reducing IO, memory usage, and
query runtimes.
Figure 10-1. Sample columnstore index with six rowgroups and eight columns
By querying for only two of the possible eight columns, the number of segments
read was reduced by 75%. This convention holds true for columnstore indexes with any
number of columns. If this table had 24 columns, then reading only 2 of them would
mean that 22 columns would not be read. The result of this feature is that the number
of segments read in a columnstore index query will be proportional to the number of
columns involved in the query.
The columns needed to satisfy a query also include any referenced by the WHERE
clause, as well as in aggregates (GROUP BY/HAVING). If a query includes views
or functions, then their contents will be evaluated to determine which columns
are required to execute a query. Listing 10-1 provides an example query against a
columnstore index.
SELECT
[City Key],
COUNT(*)
FROM fact.Sale_CCI
WHERE [Invoice Date Key] >= '1/1/2016'
GROUP BY [City Key]
ORDER BY COUNT(*) DESC;
This analytic query calculates a count of sales per city for a given time period. Of the
21 columns in Sale_CCI, only 2 were required to complete this calculation: City Key and
Invoice Date Key. As a result, 19 of the 21 columns' worth of segments will be ignored
when executing this query.
In contrast, rowstore tables store rows sequentially, with all columns for each row
stored together on pages. While reading fewer columns in rowstore indexes can reduce
the amount of data presented to an application, it does not reduce the quantity of pages
read into memory in order to retrieve the specific columns for a query. Therefore,
reducing the number of columns queried in a columnstore index will provide immediate
performance benefits that are far less pronounced in rowstore indexes.
Segment elimination is a simple and powerful tool that can be simplified into a
single optimization technique: Write queries to only include the required columns.
Since rowgroups can contain up to 2^20 (1,048,576) rows, the cost of querying unnecessary columns
can be quite high. Whereas transactional queries against rowstore tables often operate
on small numbers of rows, analytic queries can access millions of rows at a time.
Therefore, the perceived convenience of SELECT * queries will hinder performance on a
columnstore index.
Consider the query in Listing 10-2.
SELECT
*
FROM fact.Sale_CCI
WHERE [Invoice Date Key] = '2/17/2016';
This SELECT * query returns all columns for sales on a specific date. While only
30,140 rows are returned, all 21 columns need to have their segments retrieved as part of
the operation. Figure 10-3 provides the output of STATISTICS IO for this query.
Note that a total of 68,030 LOB logical reads are recorded. Listing 10-3 contains an
alternative query in which only a small set of columns required by an application are
returned.
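A query of that shape might look like the following sketch; the three columns selected here are assumptions for illustration:

SELECT
    [Sale Key],
    [Quantity],
    [Profit]
FROM fact.Sale_CCI
WHERE [Invoice Date Key] = '2/17/2016';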
Here, only three of the columns are requested, instead of all of them. The resulting
IO can be viewed in Figure 10-4.
Figure 10-4. IO for a sample query requesting data from three columns
When the column list was reduced to only those needed for the query, LOB logical
reads were reduced from 68,030 to 31,689. The performance improvement for omitting
columns will vary based on the data type, compression, and contents of each column.
Omitting a text column with no repeated values will reduce IO far more than omitting a
BIT column.
Segment elimination is an easy way to improve query speeds while also reducing
resource consumption. Anyone writing queries against columnstore indexes should
consider which columns are required for their workloads and ensure that no extra
columns are returned.
Rowgroup Elimination
Unlike rowstore indexes, columnstore indexes have no built-in order. The data that is
inserted into a columnstore index is added in the order it is received by SQL Server.
As a result, the order of data within rowgroups is the direct result of the order it was
inserted. Equally important is that UPDATE operations will reorder a columnstore index,
removing rows from one set of rowgroups and inserting the new versions into the most
current open rowgroups.
Because compressing rowgroups is a computationally expensive process, the cost to
intrinsically maintain any form of data order would be prohibitively high. This is one of
the most important concepts when architecting a columnstore index. Since data order
is not enforced by SQL Server, it is the responsibility of the architect to determine data
order up front and ensure that both data load processes and common query patterns
maintain that agreed-upon data order.
Consider the table Dimension.Employee that contains a clustered rowstore index on
the Employee Key column. If three new employees start work and are added to the table,
the INSERT operations to add them could be represented by the T-SQL in Listing 10-4.
-- Column list assumed from the WideWorldImportersDW Dimension.Employee schema:
INSERT INTO Dimension.Employee
    ([Employee Key], [WWI Employee ID], [Employee], [Preferred Name],
     [Is Salesperson], [Photo], [Valid From], [Valid To], [Lineage Key])
VALUES
(   -1, -- Clustered Index
    289, N'Ebenezer Scrooge', N'Scrooge', 0, NULL, GETUTCDATE(),
    '9999-12-31 23:59:59.9999999', 3),
(   213, -- Clustered Index
    400, N'Captain Ahab', N'Captain', 0, NULL, GETUTCDATE(),
    '9999-12-31 23:59:59.9999999', 3),
(   1017, -- Clustered Index
    501, N'Holden Caulfield', N'Phony', 0, NULL, GETUTCDATE(),
    '9999-12-31 23:59:59.9999999', 3);
There are three rows inserted into the table, with clustered index ID values of -1, 213,
and 1017. When inserted, SQL Server will place each row in the b-tree index in order
with the rest of the rows, based on those Employee Key values. As a result, the table will
remain ordered by the clustered index after the INSERT operation.
Imagine for a moment that this table did not have a rowstore index, but instead had
a clustered columnstore index. In that scenario, the three rows would be inserted at the
end of the open rowgroup(s) without any regard to the value of Employee Key. A query
that searches for a specific range of IDs will need to examine any rowgroup that contains
the IDs.
Columnstore metadata helps SQL Server locate rows based on the range of values for
each column present in each rowgroup. Consider the metadata for the Invoice Date Key
column in Fact.Sale_CCI using the query in Listing 10-5.
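A query of this general shape retrieves that metadata from sys.column_store_segments. This is a sketch; it assumes that column_id values in sys.column_store_segments align with those in sys.columns, which holds for simple tables such as this one:

SELECT
    tables.name AS table_name,
    columns.name AS column_name,
    column_store_segments.segment_id,
    column_store_segments.row_count,
    column_store_segments.min_data_id,
    column_store_segments.max_data_id
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON partitions.partition_id = column_store_segments.partition_id
INNER JOIN sys.tables
ON tables.object_id = partitions.object_id
INNER JOIN sys.columns
ON columns.object_id = tables.object_id
AND columns.column_id = column_store_segments.column_id
WHERE tables.name = 'Sale_CCI'
AND columns.name = 'Invoice Date Key'
ORDER BY column_store_segments.segment_id;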
Figure 10-5. Metadata for the Invoice Date Key column of a columnstore index
Note that min_data_id and max_data_id are identical for each rowgroup. This means
that the data contained for that column is unordered. If queries commonly filtered
using Invoice Date Key, they would need to scan all rowgroups in the columnstore index
in order to appropriately filter out the requested rows. As a columnstore index grows
over time, the cost to scan all rowgroups will become burdensome. Even on a well-
compressed columnstore index, queries will become slow, and the IO required to service
an unfilterable query will be high.
STATISTICS IO provides a useful guide to the number of rowgroups read as part of a
columnstore index scan. To demonstrate this, the query in Listing 10-6 will be used.
SELECT
SUM([Quantity])
FROM Fact.Sale_CCI
WHERE [Invoice Date Key] >= '1/1/2016'
AND [Invoice Date Key] < '2/1/2016';
This is a classic analytic query that calculates the total sales quantity for a
given month.
While this hypothetical index contains six rowgroups, only a single one is required to
satisfy the filter criteria of the query. The power of rowgroup elimination is that it scales
effectively as a columnstore index grows in size. A query that requests a narrow week of
analytic data from a table with a month of data will perform similarly to that same query
against a table with 10 years of data. This is the primary feature that allows columnstore
indexes to scale effectively, even when billions of rows are present in a table.
In the example in Listing 10-6, a simple analytic query filtered on Invoice Date Key,
but the unordered columnstore index data forced a full scan of the data to determine
which rows met the filter criteria. If Invoice Date Key is the most common filter criteria
for analysis of this data, then ordering by that column would allow for effective rowgroup
elimination.
Note that the data in the table created in Listing 10-7 is identical to the data
demonstrated earlier in this chapter, but has been subject to a clustered rowstore index
prior to being given a columnstore index. This additional step ensures that the initial
data set is ordered by Invoice Date Key. MAXDOP is intentionally set to 1 to avoid
parallelism as parallel threads may risk inserting data into the columnstore index in
multiple ordered streams rather than a single ordered stream.
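A sketch of that two-step pattern, assuming Fact.Sale_CCI_ORDERED already exists with the same schema and data as Fact.Sale_CCI, and that index names follow the convention used elsewhere in this book:

-- Step 1: order the data with a clustered rowstore index on the key column:
CREATE CLUSTERED INDEX CCI_fact_Sale_CCI_ORDERED
ON Fact.Sale_CCI_ORDERED ([Invoice Date Key]);

-- Step 2: convert to a clustered columnstore index, preserving the order:
CREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_Sale_CCI_ORDERED
ON Fact.Sale_CCI_ORDERED
WITH (DROP_EXISTING = ON, MAXDOP = 1);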
Going forward, new data would be regularly inserted into this table via standard
data load processes. Assuming the new data contains the most recent values for
Invoice Date Key, then the columnstore index will remain ordered in the future as new
data is added to it.
To test the impact of data order on Fact.Sale_CCI_ORDERED, the query from
Listing 10-6 will be executed against it, with the output tab displayed in Figure 10-8.
Instead of reading every rowgroup in the columnstore index, SQL Server only
needed to read two of them, with the remainder being skipped. Skipped segments in
STATISTICS IO indicate that rowgroup elimination is being successfully implemented.
The metadata query in Listing 10-5 can also be rerun against this ordered table to
illustrate how data order impacts columnstore metadata, with the results being shown in
Figure 10-9.
Figure 10-9. Metadata for the Invoice Date Key column of an ordered
columnstore index
The values of min_data_id and max_data_id for each rowgroup show a drastic
change from that of an unordered columnstore index. Instead of the values being the
same for each rowgroup, they progress from low values to high values as data progresses
from the first rowgroup to the latter ones. To put this in perspective, if a hypothetical
query required data for a data ID of 735270, it would only need to read the rowgroups
associated with segment_id = 7 from this list of segments. Since the metadata indicates
that the remaining segments do not contain this value, they (and their associated
rowgroups) can automatically be skipped.
Ordering data by a key column is an easy way to enable rowgroup elimination,
thereby reducing query resource consumption and improving the speed of any queries
that can make use of that data order. Effective data order doesn’t just reduce reads, but
it can also save on storage space by improving the compression ratios of the underlying
data. Listing 10-8 contains a script that will retrieve the data space used by the two
columnstore indexes featured in this chapter.
Listing 10-8. Query to Retrieve Data Space Used for Two Columnstore Indexes
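-- Note: #storage_data is assumed to be a temp table populated earlier in this
-- listing (e.g., from sp_spaceused output for each table); that setup step is
-- not shown here. Its string columns hold values such as '1234 KB', which is
-- why the trailing three characters are stripped below.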
UPDATE #storage_data
SET reserved = LEFT(reserved, LEN(reserved) - 3),
data = LEFT(data, LEN(data) - 3),
index_size = LEFT(index_size, LEN(index_size) - 3),
unused = LEFT(unused, LEN(unused) - 3);
SELECT
table_name,
rows_used,
reserved / 1024 AS data_space_reserved_mb,
data / 1024 AS data_space_used_mb,
index_size / 1024 AS index_size_mb,
unused AS free_space_kb,
CAST(CAST(data AS DECIMAL(24,2)) / CAST(rows_used AS DECIMAL(24,2))
AS DECIMAL(24,4)) AS kb_per_row
FROM #storage_data
WHERE rows_used > 0
AND table_name IN ('Sale_CCI', 'Sale_CCI_ORDERED')
ORDER BY CAST(reserved AS INT) DESC;
The data space used for each table is dramatically different, with the ordered
table consuming less than 10% of the space that the unordered table uses. This is an
exceptionally dramatic example of how ordered data can be stored more efficiently
than unordered data. Ordered data saves space because, typically, data is more similar
to other data captured within a short timeframe of it than when compared to data that
was collected years apart. Within any application, usage patterns change over time as
new features are released, old features are retired, and user behavior changes. Because
of this, data samples will look more and more different as time passes between them.
These similarities translate into compression algorithms being able to take advantage of
a data set with less distinct values for common dimensions. This also reduces dictionary
size, which also helps to prevent dictionaries from filling up and forcing the creation of
undersized rowgroups.
Real-world data may not compress as impressively as the sample here, but expect
nontrivial savings that will have a positive impact on data load processes and on
analytics speeds. It is important to remember that saving storage space also saves
memory as data remains compressed until needed by an application. Therefore, if an
ordered data set were to decrease in size by 25%, that would result in 25% less memory
being consumed in the buffer pool by columnstore index pages. Furthermore, other
common measures of server performance such as page life expectancy and latching
would improve as smaller objects can be retrieved more quickly and will impact other
data in memory less than larger objects.
An ordered columnstore index also improves UPDATE and DELETE speeds by
allowing those operations to target fewer pages. For example, consider the queries on the
ordered and unordered sales tables shown in Listing 10-9.
Listing 10-9. Query to Display Sample Row Counts for Two Columnstore
Indexed Tables
SELECT COUNT(*) AS row_count FROM Fact.Sale_CCI WHERE [Invoice Date Key] =
'1/1/2015';
SELECT COUNT(*) AS row_count FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date
Key] = '1/1/2015';
The results show that in each table, the count of rows affected is identical, as seen in
Figure 10-11.
Figure 10-11. Row counts for a sample query against ordered and unordered
columnstore indexes
For each table, 33,110 rows would be affected by an update using the same filter.
Listing 10-10 provides a simple UPDATE statement against each table.
Listing 10-10. Query to Update 33,110 Rows in Two Columnstore Indexed Tables
UPDATE Sale_CCI
SET [Total Dry Items] = [Total Dry Items] - 1,
    [Total Chiller Items] = [Total Chiller Items] + 1
FROM Fact.Sale_CCI -- Unordered
WHERE [Invoice Date Key] = '1/1/2015';

UPDATE Sale_CCI_ORDERED
SET [Total Dry Items] = [Total Dry Items] - 1,
    [Total Chiller Items] = [Total Chiller Items] + 1
FROM Fact.Sale_CCI_ORDERED -- Ordered
WHERE [Invoice Date Key] = '1/1/2015';
The results in Figure 10-12 show the resulting IO and rowgroup usage for each
UPDATE operation.
Note the dramatic difference in IO, as well as segment reads for each operation.
Because an UPDATE consists of both a DELETE and an INSERT, SQL Server had to
perform the following tasks to complete each update:
1. Locate all rows matching the filter criteria.
2. Read all columns for all rows matching the filter criteria.
3. Mark these rows as deleted in the delete bitmap.
4. Insert new versions of these rows into the delta store.
In order to insert new versions of the updated rows, SQL Server needs to read the
existing rows in their entirety, which is not a trivial operation. Once read, those rows
are marked as deleted and the new versions are inserted into the delta store. This is an
expensive process, but ordered data allows for far fewer rowgroups to be read, thereby
reducing the work needed to set up the necessary data for the insert into the delta store.
DELETE operations are improved similarly, but require far less work as they
simply need to
1. Locate all rows matching the filter criteria.
2. Mark these rows as deleted in the delete bitmap.
For both cases, an ordered columnstore index will immensely improve UPDATE
and DELETE performance when the filter criteria honors the order used in the table.
As an added bonus, an ordered columnstore index will allow for DELETE and UPDATE
operations that cause less fragmentation. Instead of flagging rows as deleted in most (or
all) rowgroups, the deletes can be isolated to a smaller number of rowgroups.
Only querying for columns A and B allows columns C–H to be automatically skipped
and all segments for those columns eliminated (36 segments in total). Querying for a
narrow date range that only requires rows in rowgroup 4 allows rowgroups 1–3 and 5–6
to also be automatically skipped, eliminating another 10 segments. The result is a query
that required only 2 out of the possible 48 segments in the table!
Combining segment and rowgroup elimination allows columnstore index queries to
scale effectively, even as data grows larger or as more columns are added to the table.
CHAPTER 11
Partitioning
As an analytic table grows in size, it becomes apparent that newer data is read far more
often than older data. While columnstore metadata and rowgroup elimination provide
the ability to quickly filter out large amounts of columnstore index data, managing a
table with millions or billions of rows can become cumbersome.
Equally important is the fact that older data tends to not change often. For a typical
fact table, data is added onto the end of it in the order it is created, whereas older data
remains untouched. If older data is modified, it is usually the result of software releases,
upgrades, or other processes that fall squarely into the world of the data architect
(that’s us!) to manage.
Table partitioning is a natural fit for a clustered columnstore index, especially if
it is large. Partitioning allows data to be split into multiple filegroups within a single
database. These filegroups can be stored in different data files that reside in whatever
locations are ideal for the data contained within them. The beauty of partitioning is
that the table is logically unchanged, with its physical structure being influenced by the
details of how it is partitioned. This means that application code does not need to change
in order to benefit from it. There are many benefits to partitioning, each of which is
described in this chapter.
Figure 11-1. How partitioning can influence storage speed and cost
For example, older partitions could be copied to the migration target in the weeks or
months prior to the migration. ETL or similar processes could then be used on the day
of the migration to catch up the target database with new data prior to permanently
cutting over from the old data source to the new one.
Figure 11-2 illustrates the difference between migrating a large/monolithic table vs. a
partitioned one.
Partitioning opens up the ability to subdivide the migration logically knowing that
the physical storage of the table will facilitate the ability to copy/move each file one by
one when needed. Instead of having to move a terabyte of data all at once or being forced
to write ETL against the entire table, the table can be subdivided into smaller pieces,
each of which is smaller and easier to move.
Partition Elimination
The logical definitions for each partition are not solely used for storage purposes. They
also assist the query optimizer and allow it to automatically skip data when the filter in
a query aligns with the partition function. For a query against a columnstore index, its
filter can allow the metadata for rowgroups outside of the target partition to be ignored.
For example, if a columnstore index contains data ranging from 2010 through 2021
and is partitioned by year (with a single partition per year), then a query requesting rows
from January 2021 would be able to automatically skip all rowgroups in partitions prior
to 2021. Figure 11-3 shows how partition elimination can reduce reads and further speed
up queries against columnstore indexes.
Database Maintenance
Some database maintenance tasks, such as backups and index maintenance, can be
executed on a partition-by-partition basis. Since the data in older partitions rarely
changes, it is less likely to require maintenance as often as newer data. Therefore,
maintenance can be focused on the specific partitions that need it most. This also means
that maintenance speeds can be greatly improved by no longer needing to operate on
the entire table at one time.
In addition, how data is managed can vary from partition to partition, for example, in
its storage tier, backup cadence, and index maintenance schedule.
Partitioning in Action
To visualize how partitioning works, a new version of the test table Fact.Sale_CCI will be
created. This version will be ordered by Invoice Date Key and also partitioned by year.
Each partition needs to target a filegroup, which in turn will contain a data file. For this
demonstration, a new filegroup and file will be created for each year represented in
the table.
Listing 11-1 shows how new filegroups can be created that will be used to store data
for the partitioned table.
Listing 11-1. Script That Creates a New Filegroup for Each Year of Data in
the Table
ALTER DATABASE WideWorldImportersDW ADD FILEGROUP WideWorldImportersDW_2013_fg;
ALTER DATABASE WideWorldImportersDW ADD FILEGROUP WideWorldImportersDW_2014_fg;
-- The remaining three filegroups follow the same pattern (years assumed from
-- the five filegroups referenced later in this chapter):
ALTER DATABASE WideWorldImportersDW ADD FILEGROUP WideWorldImportersDW_2015_fg;
ALTER DATABASE WideWorldImportersDW ADD FILEGROUP WideWorldImportersDW_2016_fg;
ALTER DATABASE WideWorldImportersDW ADD FILEGROUP WideWorldImportersDW_2017_fg;
Once executed, the presence of the new database filegroups can be confirmed by
checking the Filegroups menu within the database’s properties, as seen in Figure 11-4.
Figure 11-4. New filegroups that will be used to store partitioned data
While the five new filegroups are present in the database, they contain no files. The
next step in this process is to add files to these filegroups (one file each). Listing 11-2
contains the code needed to add these files.
For this demonstration, each file has the same size and growth settings, but for a
real-world table, these numbers will vary. By inspecting the size of the data that is to
be partitioned, the amount of space needed for each year should be relatively easy to
calculate. Note that once a table’s data is migrated to a partitioned table, the data files
that used to contain its data will now have additional free space that can be reclaimed, if
needed. Figure 11-5 shows the new files listed under the Files menu within the database
properties.
Figure 11-5. New database files that will be used to store partitioned data
New database files can be placed on any storage available to the server. This is
where the table’s data can be customized to meet whatever SLAs it is subject to. For
this example, if data from prior to 2016 is rarely accessed, it could be placed into files
on slower storage. Similarly, more recent data could be maintained on faster storage to
support frequent analytics.
The next step in configuring partitioning is to determine how to slice up data from
the table into the newly created files. This is achieved using a partition function. When
working with a columnstore index, ensure that the partition function uses the same data
type as the column by which the table is ordered. This convention applies to clustered
rowstore indexes as well. If the data type of the partition function does not match the
target column, an error will be thrown when the
table is created. For this demonstration, the partition function will split up data using the
DATE data type, which corresponds to the Invoice Date Key column that Fact.Sale_CCI is
ordered by, as seen in Listing 11-3.
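A sketch of such a function; the function name and yearly boundary dates are assumed to match the five filegroups created earlier:

CREATE PARTITION FUNCTION fact_Sale_CCI_years_function (DATE)
AS RANGE RIGHT FOR VALUES ('1/1/2014', '1/1/2015', '1/1/2016', '1/1/2017');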
RANGE RIGHT specifies that the boundaries created will be defined using the dates
provided as starting points. The result is that data will be divided into five buckets,
like this:
• [Invoice Date Key] < '1/1/2014'
• [Invoice Date Key] >= '1/1/2014' and [Invoice Date Key] < '1/1/2015'
• [Invoice Date Key] >= '1/1/2015' and [Invoice Date Key] < '1/1/2016'
• [Invoice Date Key] >= '1/1/2016' and [Invoice Date Key] < '1/1/2017'
• [Invoice Date Key] >= '1/1/2017'
If used, RANGE LEFT would result in date ranges where the inequality operators
are adjusted so that boundaries are checked with greater than and less than or equal to,
rather than what was presented earlier. If unsure of which to use, consider implementing
RANGE RIGHT for time-based dimensions as it is typically a more natural division that
cleanly divides up units of months, quarters, years, etc. The following list shows how the
boundaries would be defined in the partition function in Listing 11-3 if RANGE LEFT
were used:
• [Invoice Date Key] <= '1/1/2014'
• [Invoice Date Key] > '1/1/2014' and [Invoice Date Key] <= '1/1/2015'
• [Invoice Date Key] > '1/1/2015' and [Invoice Date Key] <= '1/1/2016'
• [Invoice Date Key] > '1/1/2016' and [Invoice Date Key] <= '1/1/2017'
• [Invoice Date Key] > '1/1/2017'
Note that the partition function will always specify one less boundary than there
are partitions (and therefore filegroups, in this example). Here, the partition function
provides four dates that form boundaries defining five distinct date ranges, which are
subsequently mapped onto the partition scheme and assigned filegroups. If the number
of partitions the function defines does not match the number of filegroups in the
partition scheme, an error will be returned, similar to what is seen in Figure 11-6.
Figure 11-6. Error received if partition scheme contains too few filegroup entries
The error is verbose enough to remind the user that the function generates more
or fewer partitions than are afforded by the partition scheme. By writing out the
time periods desired for the target table, the task of generating a partition function and
partition scheme becomes far simpler.
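For reference, a sketch of the partition scheme that maps those five ranges onto the five filegroups; the scheme name is taken from Listing 11-5, and the function name is assumed from the sketch above:

CREATE PARTITION SCHEME fact_Sale_CCI_years_scheme
AS PARTITION fact_Sale_CCI_years_function
TO (WideWorldImportersDW_2013_fg, WideWorldImportersDW_2014_fg,
    WideWorldImportersDW_2015_fg, WideWorldImportersDW_2016_fg,
    WideWorldImportersDW_2017_fg);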
With database files and filegroups defined, as well as a partition function and
scheme, the final step to implementing partitioning is to create a table using the newly
created partition scheme. Listing 11-5 creates a new version of Fact.Sale and partitions it
using the Invoice Date Key column on fact_Sale_CCI_years_scheme.
Figure 11-7. Error received if the data type of the partition column and partition
function do not match
Data types between the partition function and column must match exactly. DATE
and DATETIME are not compatible, nor are other data types that may seem similar. SQL
Server does not automatically convert between data types when evaluating partition
functions and will instead throw an error when the table is created.
Once created, Fact.Sale_CCI_PARTITIONED is managed in the same fashion as
Fact.Sale_CCI_ORDERED was.
When complete, two tables will exist for demonstration purposes that are identical,
except that one is partitioned and the other is not. Both are ordered by Invoice Date Key
and will benefit from rowgroup elimination whenever filtered by that column. Listing 11-7
shows two queries that aggregate Quantity for a single month against each table.
SELECT
SUM([Quantity])
FROM Fact.Sale_CCI_PARTITIONED
WHERE [Invoice Date Key] >= '1/1/2016'
AND [Invoice Date Key] < '2/1/2016';
While the queries are identical aside from the table name, the STATISTICS IO output
illustrates the differences in execution between each, as seen in Figure 11-8.
The output of STATISTICS IO shows multiple ways in which query execution varied
for each table. The most significant difference is in the reported segment reads. The
nonpartitioned table read 2 segments while skipping 22, whereas the partitioned table
read 1 segment while skipping 3. This IO reflects the original query that calculates a sum
using only rows with an Invoice Date Key within January 2016.
For the table that is ordered and not partitioned, metadata needs to be reviewed
from the entire table prior to using rowgroup elimination to skip segments. In
the partitioned table, though, rowgroups that do not contain data from 2016 are
eliminated at the partition level before their metadata is ever consulted.
Figure 11-9. Query execution plans for a nonpartitioned columnstore index vs. a
partitioned columnstore index
The execution plan for the nonpartitioned table is more complex as SQL Server
determines that parallelism may help in processing the large number of rows. The
execution plan for the partitioned table is simpler, as the optimizer chooses to not use
parallelism. Note that the use of a hash match is to support the aggregate pushdown
for the SUM operator into the columnstore index scan step. Also interesting to note is
that the count of rows read in the execution plan is lower on a partitioned table (vs. a
nonpartitioned table) when fewer partitions need to be read.
While the output of each query is identical and each execution plan produces the
same result set, the ability to forgo parallelism saves computing resources, as is implied
by the greatly reduced query cost for the partitioned table.
In addition to partition elimination, index maintenance can be adjusted to skip
older, less updated partitions. For example, if a rebuild was deemed necessary for this
columnstore index to clean it up after some recent software releases, the nonpartitioned
table would need to be rebuilt en masse, as seen in Listing 11-8.
The rebuild operation takes 61 seconds to complete. For a larger columnstore index,
it could be significantly longer. If the portion of data impacted by the software release
is limited to newer data only, it is very likely that only the most current partition needs
to be rebuilt. Listing 11-9 shows how an index rebuild can be executed against a single
partition.
This rebuild only takes 1 second. This is because the most recent partition does not
contain a full year of data. To provide a fairer assessment of rebuild times, Listing 11-10
shows a rebuild operation against a partition that contains a full year of data.
An alternative to this would be to switch out the partition, modify the data, and then
reinsert the data back into the table. This allows the data that is to be modified to be
isolated prior to making any changes. The script in Listing 11-11 creates a new staging
table with the same schema as Fact.Sale_CCI_PARTITIONED that will be used as the
target for a partition switch.
Listing 11-11. Script to Create a Staging Table for Use in a Partition Switching
Operation
CREATE TABLE Fact.Sale_CCI_STAGING
( [Sale Key] [bigint] NOT NULL,
[City Key] [int] NOT NULL,
[Customer Key] [int] NOT NULL,
[Bill To Customer Key] [int] NOT NULL,
[Stock Item Key] [int] NOT NULL,
[Invoice Date Key] [date] NOT NULL,
[Delivery Date Key] [date] NULL,
[Salesperson Key] [int] NOT NULL,
[WWI Invoice ID] [int] NOT NULL,
[Description] [nvarchar](100) NOT NULL,
[Package] [nvarchar](50) NOT NULL,
[Quantity] [int] NOT NULL,
[Unit Price] [decimal](18, 2) NOT NULL,
[Tax Rate] [decimal](18, 3) NOT NULL,
[Total Excluding Tax] [decimal](18, 2) NOT NULL,
[Tax Amount] [decimal](18, 2) NOT NULL,
[Profit] [decimal](18, 2) NOT NULL,
[Total Including Tax] [decimal](18, 2) NOT NULL,
[Total Dry Items] [int] NOT NULL,
[Total Chiller Items] [int] NOT NULL,
[Lineage Key] [int] NOT NULL)
ON WideWorldImportersDW_2016_fg;
CREATE CLUSTERED COLUMNSTORE INDEX CCI_fact_Sale_CCI_STAGING ON
Fact.Sale_CCI_STAGING;
Note that the staging table is created on the same filegroup as the source data that
is to be switched. If desired, the staging table could be created using the same partition
scheme as Fact.Sale_CCI_PARTITIONED, which would allow the partition number to be
specified in the partition switch, rather than having to explicitly provide a filegroup in
the table create statement. Either syntax works; the choice can be left to whichever the
operator finds easier to implement.
Once the staging table is created, a partition on the 2016 filegroup can be switched
from the source table into the staging table using the script in Listing 11-12.
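A sketch of that switch, assuming 2016 data resides in partition number 4 of the partition scheme:

ALTER TABLE Fact.Sale_CCI_PARTITIONED
SWITCH PARTITION 4 TO Fact.Sale_CCI_STAGING;

Because a switch is a metadata-only operation, it completes nearly instantly regardless of the number of rows involved.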
Listing 11-13. Validating the Row Count in the Staging Table After the Partition
Switch Operation Completes
SELECT COUNT(*) FROM Fact.Sale_CCI_STAGING;
The result of the count query shows that over 3 million rows were switched from
Fact.Sale_CCI_PARTITIONED to the staging table. From here, data can be modified in
the staging table as needed to resolve whatever the identified issues are. Once the data is
cleaned up, it can be inserted back into the original columnstore index, ensuring that the
resulting data is clean with no fragmentation. Listing 11-14 shows a sample process for
how this data modification could occur.
Listing 11-14. Example of Data Modification Using Staging Data. Data Is Moved
Back into the Parent Table Using an INSERT Operation
UPDATE Sale_CCI_STAGING
SET [Total Dry Items] = [Total Dry Items] + 1
FROM Fact.Sale_CCI_STAGING;
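The move back into the parent table described in the caption could then look like this sketch; because the row count far exceeds 102,400, the insert benefits from bulk loading:

INSERT INTO Fact.Sale_CCI_PARTITIONED
SELECT * FROM Fact.Sale_CCI_STAGING;
-- Optionally truncate or drop the staging table afterward.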
Listing 11-15. Example of Data Modification Using Staging Data. Data Is Moved
Back into the Parent Table Using Partition Switching
UPDATE Sale_CCI_STAGING
SET [Total Dry Items] = [Total Dry Items] + 1
FROM Fact.Sale_CCI_STAGING;
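The switch back could then look like this sketch. A trusted CHECK constraint is required so that SQL Server can verify every staged row belongs to the target partition's range; the partition number and date range are assumed, as before:

ALTER TABLE Fact.Sale_CCI_STAGING ADD CONSTRAINT CK_Sale_CCI_STAGING_2016
CHECK ([Invoice Date Key] >= '1/1/2016' AND [Invoice Date Key] < '1/1/2017');

ALTER TABLE Fact.Sale_CCI_STAGING
SWITCH TO Fact.Sale_CCI_PARTITIONED PARTITION 4;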
This alternative code will be significantly faster as the partition switch is a speedy,
minimally logged DDL operation, whereas the large INSERT in Listing 11-14 needs to
incur the cost of writing all of this data back to Fact.Sale_CCI_PARTITIONED.
Partition switching is a versatile tool that can be used in a wide variety of data
modification, archival, and deletion scenarios. Its use will vary between applications,
but provides a fast way to shift data in and out of tables without the fragmentation and
expense of a large write operation.
Partitioning Guidelines
While partitioning may sound like a win for any columnstore index, it should not be
automatically implemented without research and foresight. Partitioning works well
in larger tables, but does not provide value to small columnstore indexes. Therefore,
capacity planning should be a step in the decision-making process to determine whether
partitioning is a good fit or not for any given table.
Rowgroups cannot span partitions, though. Consider a table whose data currently fills
five complete rowgroups: if it were split across ten partition boundaries, the new
partitioned table would contain ten partitions, each of which consists of a single
(undersized) rowgroup. The result is a table with ten rowgroups instead of five.
Generally speaking, partitioning is not appropriate for a columnstore index unless
the table contains at least tens or hundreds of millions of rows. Equally important,
partitions need to be large enough to facilitate the use of filled-up rowgroups. Therefore,
the partition function and partition scheme need to ensure that each partition contains
at least 2^20 (1,048,576) rows. Undersized partitions will result in undersized
rowgroups, which will lead to fragmentation over time.
The larger a columnstore index, the more it will benefit from partitioning. Aligning
partition boundaries to organizational needs can also help in determining how to
implement partitioning on a columnstore index. If an organization archives a billion
rows of data each quarter, then partitioning by quarter makes perfect sense. If instead,
the archival is organized by year, then partitioning by year would be more relevant.
Storage Choice
If data within a columnstore index is accessed differently, the data files used for the
partitioned table can mirror that usage. If older data is rarely accessed and can tolerate
more latency, then it can be moved onto slower/less expensive hardware. If newer data
needs to be highly available with minimal latency, then it can be placed on faster and
more expensive storage.
As a result, a large table can be split up to save money. Every terabyte of data that
lands on cheaper storage represents cost savings that can be easily quantified. While
partitioning’s primary purpose is not to save money, the ability to shift workloads via the
strategic placement of data files can achieve this without significant effort.
When building a partitioned table, identify if the table has data that follows different
usage patterns and assign data files based on that usage to slower or faster storage,
when possible. If performance requirements change with time, data files can be moved
between storage locations to ensure that SLAs are still met, even if data that was once
rarely needed becomes critical to frequent analytic processes.
Additional Benefits
One of the greatest benefits of table partitioning is that it requires no application code
changes. All partitioning structures are internal to SQL Server and have no bearing
on an application beyond the performance experienced as the table is accessed. This
also means that partitioning can be tested out for analytic data and kept or rolled
back depending on the results of that testing. A “rollback” would consist of creating a
second copy of data and swapping it into the production location, but is a reasonable
process with analytic data where data load processes are well within the confines of data
architecture. This allows partitioning to be tested with minimal impact on the code that
developers write to consume this data.
Partitioning is an optional step when implementing a columnstore index, but can
improve maintenance, speed up analytic workloads, and potentially save money. This
feature should be targeted at larger tables with at least tens of millions of rows and that
are expected to grow rapidly over time.
CHAPTER 12
Nonclustered Columnstore Indexes on Rowstore Tables
Clustered columnstore indexes provide a primary storage mechanism for analytic data.
For tables that are intended for use as OLAP data sources, this is the optimal choice and
will provide a data structure that facilitates fast and efficient reads and writes against
analytic data.
For tables that are primarily transactional in nature, but also have analytic queries
executed against them, a clustered columnstore index is not appropriate. There are a
handful of options available to manage those additional analytic workloads, including
Whereas a purely OLTP or OLAP workload can be managed with purely transactional
or analytic data structures, a mixed workload is more complex and requires a more
careful inspection of reads and writes to understand how to best manage it. This chapter
will discuss each alternative and when they are most appropriate, providing guidance on
how to choose the best storage methodology for a given workload.
Figure 12-1. The challenge of OLAP and OLTP queries against rowstore tables
While mixing analytic and transactional needs works for small, less busy tables,
challenges are guaranteed to arise when usage or size becomes significant. As a rule,
it is generally advisable to avoid relying on rowstore tables for analytics if the table is
expected to grow large.
Covering indexes allow for a rowstore index to completely cover an analytic query.
This can offer improvement from clustered index scans and key lookups and is a good
solution when analytics are limited in scope to a handful of well-understood queries.
For example, consider the query in Listing 12-1.
SELECT
COUNT(*) AS sale_count,
COUNT(DISTINCT SalespersonPersonID) AS sales_people_count,
SUM(CAST(IsUndersupplyBackordered AS INT)) AS undersupply_backorder_count
FROM Sales.Orders
WHERE CustomerID = 90
AND OrderDate >= '1/1/2015'
AND OrderDate < '1/1/2016';
This query is a common example of analytics against transactional data. Figure 12-2
shows the execution plan for this query.
Figure 12-2. Execution plan for an analytic query against a transactional table
The execution plan shows an index seek that filters on the index for CustomerID. The
key lookup retrieves the other columns from the clustered index that are needed
to satisfy the query. The IO for the query as found in STATISTICS IO is shown in
Figure 12-3.
Note that the key lookup incurred quite a bit of logical IO as SQL Server had to
return to the clustered index to retrieve the OrderDate, SalespersonPersonID, and
IsUndersupplyBackordered columns. If this query is executed often and does not vary
significantly in form, a covering index can be an adequate way to manage it. Listing 12-2
creates an index that covers this sample query.
Listing 12-2. Nonclustered Rowstore Index That Fully Covers an Analytic Query
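A minimal sketch of such an index follows; the index name is illustrative, while the key and included columns come from the sample query in Listing 12-1:

CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
ON Sales.Orders (CustomerID, OrderDate)
INCLUDE (SalespersonPersonID, IsUndersupplyBackordered);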
After creating this new covering index, the execution plan for the test query has been
simplified to what is seen in Figure 12-4.
Figure 12-4. Execution plan for an analytic query using a rowstore covering index
The execution plan shows an index seek, rather than a seek plus a key lookup, as was
the case before. The updated output of STATISTICS IO is shown in Figure 12-5.
Figure 12-5. STATISTICS IO for an analytic query using a rowstore covering index
The new STATISTICS IO output shows a sevenfold decrease in IO, from 470 reads
to 67. The covering index provides a significant performance boost to the analytic query,
and as an added bonus can reduce contention as the new index is a separate object that
can be read separately from the clustered index.
Covering indexes are great tools for analytic queries that are specific, consistent, and
few in number. There is a danger in creating excessive numbers of covering indexes to try
and keep up with new analytic queries. Over-indexing is a legitimate problem that can
hamper write performance, waste storage, and consume significant memory. Similarly,
covering indexes that contain too many columns are destined to waste space. For all
intents and purposes, a covering index with too many columns can quickly devolve into
a copy of the table, which is expensive to maintain.
The ideal scenarios where a covering index can be useful can be generalized as
follows:
Figure 12-6 shows a simple representation of workloads against an OLAP data copy.
While separating transactional and analytic data is by far the most effective way of
managing the challenge of mixed workloads, it is also the most disruptive.
AlwaysOn and Replication are mature tools that provide different ways to produce
secondary copies of data for use by analytic processes. They require new components
of SQL Server to be learned and implemented, as well as a commitment for additional
hardware to support a new target for copies of OLTP data. They can also provide high
availability, in the event that the primary transactional database fails.
If a database already uses AlwaysOn Availability Groups, then nonclustered
columnstore indexes can be leveraged to off-load analytic queries to a readable
secondary. Similarly, if Replication is used, the Replication target can be outfitted with a
clustered or nonclustered columnstore index, if its primary purpose is to service analytic
queries.
In addition to these built-in tools, an architect can create their own data copies and
manage them via SQL Server Integration Services or some other data movement process.
While manually building a data-copy process is a bit more labor intensive, it allows
for the use of existing tools and for any amount of customization. This is often how
organizations start their data warehousing environments.
Many third-party tools will perform similar tasks and allow for additional
processing/management of data. The upside to investing in a tool by an outside
organization is that the time, resources, and expertise needed to build and maintain it
fall squarely on their shoulders, freeing up resources for other projects. The downsides
are the cost and the investment in a new tool that may prove difficult to move away from
in the future.
While solutions such as these incur latency, the latency can be controlled via
configuration settings within each of them so that it remains within the bounds of
organizational needs.
The takeaway from having both rowstore and columnstore indexes in a single table is
that both can be used simultaneously, allowing transactional and analytic operations to
execute side by side, rather than exclusively competing for resources.
To test the impact of nonclustered columnstore indexes on a transactional table,
consider the script in Listing 12-3.
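A sketch of such a script follows; the index name NCCI_Orders matches the name used later in Listing 12-16, while the exact column list is assumed from the sample query:

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON Sales.Orders (CustomerID, OrderDate, SalespersonPersonID, IsUndersupplyBackordered);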
In the index creation statement, only columns involved in the analytic query were
included. More can be added if needed for other common queries, but the goal is to
ensure that analytics are used by the columnstore index without the need to refer to the
clustered rowstore index.
Once added, the test query in Listing 12-1 can be executed again. The execution plan
is shown in Figure 12-8.
The execution plan shows that the columnstore index was exclusively used to return
query results. Figure 12-9 shows the output of STATISTICS IO for the query.
Reads are about the same as with a covering index, with the added benefit that key
lookups to the clustered index are avoided, ensuring no contention with other queries
that are using it at the same time.
Compression Delay
Transactional data is often written and read frequently for a short span of time, after
which it is subject to less and less data access over time. The new data can be viewed as
hot data, whereas data that is somewhat old can be viewed as warm data, and older data
can be seen as cold data. While the table is transactional in nature, a vast majority of data
access will involve only the hot data.
For example, consider an online order tracking system. The following is a summary
of some of the steps involved in the order creation, processing, and completion, as well
as notes on data access:
8. User accesses this data read-only via the order history list in
the future.
The index will initially perform exactly as it did before. The difference will be when
data is written to the delta store. Any rows inserted into the delta store will be retained
for at least 10 minutes prior to writing to compressed rowgroups. Without compression
delay, the tuple mover will periodically move rows from the delta store to compressed
rowgroups. With this feature, it will now wait at least 10 minutes before doing so.
The compression delay may be adjusted on an existing index, if needed, without the
need for a rebuild. For example, if it is determined that data is hot for 30 minutes, rather
than 10, an ALTER INDEX statement can adjust the compression delay accordingly, as
seen in Listing 12-5.
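A sketch of that statement, assuming the index name NCCI_Orders from earlier in this chapter:

ALTER INDEX NCCI_Orders ON Sales.Orders
SET (COMPRESSION_DELAY = 30 MINUTES);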
1. Measure new rows inserted over time. Ideally, the delta store
would contain no more than a few rowgroups' worth of data (a few million rows or less)
that are not used heavily for analytics.
Compression delay can be a large or small number. If a transactional table has 1,000
new rows inserted per hour and those rows are heavily modified for 6 hours, then a
compression delay of 360 minutes would be perfectly acceptable. In that scenario, the
delta store would contain an average of 6,000 rows, which is small enough that scanning
it would not be painful.
Alternatively, if a table has 2,000,000 rows inserted per hour, then compression delay
would need to be more restrictive. If data in that table is hot for 2 hours, then an ideal
compression delay value would be somewhere in the range of 30–120 minutes. Testing
would be required to balance the needs of write processes vs. those of analytic read
processes. The diagram in Figure 12-10 shows how compression delay impacts the flow
of data for transactional and analytic queries with a nonclustered columnstore index.
Figure 12-10. Data flow for OLAP and OLTP queries against a nonclustered
columnstore index with compression delay
The goal represented in this data flow is for transactional queries to target the
rowstore indexes and the delta store, whereas analytic queries target the columnstore
index. Compression delay helps to ensure that OLTP writes are not impacting the
performance of the compressed rowgroups within the nonclustered columnstore index.
OLTP reads would be expected to target the rowstore indexes and would rarely make use
of the columnstore index.
Compression delay can be applied to clustered columnstore indexes as well using
the same syntax. This can be an excellent solution when data load processes fulfill one of
these criteria:
SELECT
tables.name AS table_name,
indexes.name AS index_name,
indexes.type_desc AS index_type,
indexes.compression_delay
FROM sys.indexes
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
WHERE indexes.type_desc IN ('NONCLUSTERED COLUMNSTORE', 'CLUSTERED COLUMNSTORE');
Row 4 (circled) shows the index that was altered by the query in Listing 12-6 and
confirms the compression delay setting of 60 minutes.
Fragmentation due to deleted rows can be quantified as a percentage of total rows,
using the query shown in Listing 12-8.
SELECT
objects.name,
partitions.partition_number,
dm_db_column_store_row_group_physical_stats.row_group_id,
dm_db_column_store_row_group_physical_stats.total_rows,
dm_db_column_store_row_group_physical_stats.deleted_rows,
CAST(100 * CAST(deleted_rows AS DECIMAL(18,2)) / CAST(total_rows AS DECIMAL(18,2)) AS DECIMAL(18,2)) AS percent_deleted
FROM sys.dm_db_column_store_row_group_physical_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_physical_stats.object_id
INNER JOIN sys.partitions
ON partitions.object_id = objects.object_id
AND partitions.partition_number = dm_db_column_store_row_group_physical_stats.partition_number
WHERE objects.name = 'Sale_CCI';
This query specifically targets a single table (Sale_CCI) and returns the total rows,
deleted rows, and percent deleted per rowgroup, as seen in Figure 12-12.
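A sketch of the filtered index that Listing 12-9 creates follows; the index name and column list are assumptions, while the filter comes from the surrounding text:

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Filtered
ON Sales.Orders (CustomerID, OrderDate, SalespersonPersonID, IsUndersupplyBackordered)
WHERE PickedByPersonID IS NOT NULL;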
This index will only include rows that happen to have a value for the
PickedByPersonID column. As a result, rows will not be inserted into the columnstore
index until they have been boxed and are ready to ship. Because of this, any data
manipulation that occurs prior to this point in time will not impact the performance of
the columnstore index.
Listing 12-10 calculates the number of rows that meet the filter criteria vs. the
number that do not.
Listing 12-10. Calculating the Number of Rows That Meet the Filtered Index
Criteria
SELECT
SUM(CASE WHEN PickedByPersonID IS NULL THEN 1 ELSE 0 END) AS
orders_not_picked,
SUM(CASE WHEN PickedByPersonID IS NOT NULL THEN 1 ELSE 0 END) AS
orders_picked
FROM Sales.Orders
The results in Figure 12-13 show that about 14% of the rows in the table are not
picked and therefore are still classified as hot data.
Figure 12-13. Count of rows that meet the filtered index criteria
Typically, hot data will constitute a small fraction of the data in a table. If a simple
filter can be used to determine whether data is hot or not, then it can be applied to
the nonclustered columnstore index to ensure that the index's performance is not
negatively impacted by frequently changing hot data.
Consider the analytic query presented in Listing 12-1. If this query was intended
to only target data that is no longer hot, then adding the filter criteria used on the
nonclustered columnstore index to this query would allow it to use the filtered version of
the index, as seen in Listing 12-11.
SELECT
COUNT(*) AS sale_count,
COUNT(DISTINCT SalespersonPersonID) AS sales_people_count,
SUM(CAST(IsUndersupplyBackordered AS INT)) AS undersupply_backorder_count
FROM Sales.Orders
WHERE CustomerID = 90
AND OrderDate >= '1/1/2015'
AND OrderDate < '1/1/2016'
AND PickedByPersonID IS NOT NULL;
Listing 12-12. Query That Lists Indexes with Filters in a Given Database
SELECT
indexes.name,
indexes.type_desc,
indexes.filter_definition
FROM sys.indexes
WHERE indexes.has_filter = 1;
The results of the query are useful in understanding what filters exist and their
definitions, as shown in Figure 12-15.
If the database contains many filtered indexes, the query could be further refined to
omit rowstore indexes or other noise.
There is a single pitfall in filtered indexes that needs to be reviewed prior to using
them, and that is filter column sensitivity. Ideally, as data moves from hot to warm
to cold, it is added to the filtered nonclustered columnstore index and rarely (if ever)
removed. If rows are capable of alternating between satisfying the filter clause and then
no longer meeting the criteria, there is a danger that a single row in a table could be
added and removed from the columnstore index many times. If this is common, then the
filtered index may end up being as fragmented as a nonfiltered index would be.
Consider a scenario using the filtered nonclustered columnstore index created in
Listing 12-9. If a row were assigned a value for PickedByPersonID, it would immediately
be inserted into the delta store for the columnstore index where it would await
compression via the tuple mover. If PickedByPersonID is then reset to NULL, it would
be deleted from the columnstore index as it no longer meets the filter criteria. When
designing any filtered index, there is value in ensuring that rows will not move in and out
of the index often and that data follows a one-way journey from hot to warm to cold.
Compression delay can be combined with a filtered nonclustered columnstore index
as a way to buffer out-of-band changes to data that cause it to move between hot and
warm via routine processes. Listing 12-13 shows a filtered nonclustered columnstore
index that includes compression delay.
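A sketch combining both features follows; the index definition and the 60-minute delay are illustrative:

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Filtered
ON Sales.Orders (CustomerID, OrderDate, SalespersonPersonID, IsUndersupplyBackordered)
WHERE PickedByPersonID IS NOT NULL
WITH (COMPRESSION_DELAY = 60 MINUTES, DROP_EXISTING = ON);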
This index provides additional buffering against the possibility that rows that have
been assigned a PickedByPersonID might later be
• Deleted
• Updated
• Reset to a NULL PickedByPersonID
A transactional table may make use of filtered nonclustered columnstore indexes,
compression delay, or both and do so with success. The details as to which features to
use and how to configure them should be driven by organizational need and its data life
cycle. There is no one-size-fits-all solution in making these decisions. One table in one
database for an organization may benefit from a compression delay of 10 minutes, whereas
1,440 minutes might be optimal for another table. Filters might fully encapsulate a key
use case in one workflow, while other workloads may not be easy to express as an index filter.
Code Changes
If compression delay and filtered nonclustered columnstore indexes are unable
to effectively manage analytic workloads against an OLTP table, it is possible that
organizational logic, code, or both require review.
If application or database code can be improved in a way that supports both analytic
and transactional workloads against the same table, then making those changes can
spare an organization from having to architect a more complex and expensive
OLAP solution.
Ultimately, a solution exists for any analytic data challenge, but the ideal solution
will be the one that costs the fewest resources while being capable of scaling most easily
into the future. Real-time operational analytics are not appropriate for all workloads, and
there will often be scenarios where a more expensive and resource-intensive solution
(such as AlwaysOn or Replication) is the correct solution.
This insert operation doubles the size of the table. Once complete, the query in
Listing 12-15 can be executed to confirm the current state of rowgroups within the
nonclustered columnstore index.
SELECT
objects.name,
partitions.partition_number,
dm_db_column_store_row_group_physical_stats.row_group_id,
dm_db_column_store_row_group_physical_stats.has_vertipaq_optimization,
dm_db_column_store_row_group_physical_stats.state_desc
FROM sys.dm_db_column_store_row_group_physical_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_physical_stats.object_id
INNER JOIN sys.partitions
ON partitions.object_id = objects.object_id
AND partitions.partition_number = dm_db_column_store_row_group_physical_stats.partition_number
AND partitions.index_id = dm_db_column_store_row_group_physical_stats.index_id
WHERE objects.name = 'Orders'
ORDER BY dm_db_column_store_row_group_physical_stats.row_group_id;
The results in Figure 12-16 show both the existing compressed rowgroup and the
newly created open rowgroup.
Data in the new rowgroup still resides in the delta store as a compression delay of
30 minutes was specified when the columnstore index was created. To avoid waiting 30
minutes for the rowgroup to be processed by the tuple mover, the T-SQL in Listing 12-16
will be used to enable that process immediately.
Listing 12-16. T-SQL to Enable the Tuple Mover to Process Open Delta
Rowgroups
ALTER INDEX NCCI_Orders ON Sales.Orders REORGANIZE
WITH (COMPRESS_ALL_ROW_GROUPS = ON);
The results in Figure 12-17 show the state of the columnstore index immediately
after this script completed.
Note that the newly created rowgroup (row_group_id = 2) is compressed and is using
Vertipaq optimization. If the metadata script in Listing 12-15 is executed again after a few
minutes pass, then the rowgroup in the TOMBSTONE state will be removed, as shown in
Figure 12-18.
The result of the INSERT operation is two rowgroups: one created alongside the initial
creation of the columnstore index and the other created when a new set of rows was
inserted and compressed. As shown in this example, Vertipaq optimization will be used
for nonclustered columnstore indexes. This is a boon to columnstore compression and
ensures that each segment is compressed as efficiently as possible, despite representing
transactional data that may be subject to change more frequently than typical
analytic tables.
It is the inclusion of the clustered rowstore index key columns in the nonclustered
columnstore index that enables Vertipaq optimization to be used. Without the key
columns included, linking rows within the columnstore index back to the rowstore
index would be impossible and the columnstore index would need to be rebuilt
whenever data is modified. Such a cost would be prohibitively high. Having clustered
key columns available ensures that rows within the nonclustered columnstore index
can be freely reordered without any danger that they could not easily be tied back to the
clustered index.
Listing 12-17. Script That Returns Index Usage Data for One Table
SELECT
tables.name AS TableName,
indexes.name AS IndexName,
dm_db_index_usage_stats.user_seeks,
dm_db_index_usage_stats.user_scans,
dm_db_index_usage_stats.user_lookups,
dm_db_index_usage_stats.user_updates,
dm_db_index_usage_stats.last_user_seek,
dm_db_index_usage_stats.last_user_scan,
dm_db_index_usage_stats.last_user_lookup,
dm_db_index_usage_stats.last_user_update
FROM sys.dm_db_index_usage_stats
INNER JOIN sys.tables
ON tables.object_id = dm_db_index_usage_stats.object_id
INNER JOIN sys.indexes
ON indexes.object_id = dm_db_index_usage_stats.object_id
AND indexes.index_id = dm_db_index_usage_stats.index_id
WHERE tables.name = 'Orders'
CHAPTER 13
Nonclustered Rowstore Indexes on Columnstore Tables
Clustered columnstore indexes provide effective enough compression and data access
speeds that most typical analytic workloads will not require any other indexes to provide
adequate performance.
The key to this performance lies in the underlying data order for a given analytic
table. As long as queries using that data are able to filter based on the dimension that
the data is ordered by (typically time), they can benefit from rowgroup elimination. For
unusual queries that slice the data by other dimensions, the result will often be a full
scan of the columnstore index. For large tables, this is prohibitively expensive and will
necessitate another solution to help in optimizing these workloads.
Listing 13-1. Query That Filters and Aggregates Across the Time Dimension of
an Ordered Clustered Columnstore Index
SELECT
COUNT(*),
SUM(Quantity) AS total_quantity
FROM Fact.Sale_CCI_ORDERED
WHERE [Invoice Date Key] >= '11/1/2015'
AND [Invoice Date Key] < '1/1/2016';
When executed, the result is returned quite quickly. Figure 13-1 shows the output of
STATISTICS IO.
Note that the segment report shows 4 segments read and 21 skipped. In addition, the
logical reads are quite low given that this table contains 25 million rows. Both of these
are indicators that rowgroup elimination was used effectively and helped ensure that
only a small slice of the table needed to be read in order to process the query and return
results.
So long as the analytic queries line up filters, ordering, and aggregation with the
column(s) that the table is ordered by, then performance like this is to be expected.
Consider the very different query provided in Listing 13-2.
Listing 13-2. Analytic Query That Does Not Use a Table’s Natural Ordering
SELECT
COUNT(*),
SUM(Quantity) AS total_quantity
FROM Fact.Sale_CCI_ORDERED
WHERE [Stock Item Key] = 186;
This query calculates quantity using Stock Item Key as the filter. Since the table
is ordered by Invoice Date Key, SQL Server has no natural way to achieve rowgroup
elimination. Figure 13-2 shows the STATISTICS IO output for this query.
This time, all rowgroups were read in the columnstore index and none were skipped.
This performance can be attributed to the table containing a total of 227 distinct values
for Stock Item Key, scattered across all rowgroups in no particular order. Listing 13-3
provides the query to return metadata about the Stock Item Key
column contents in the columnstore index.
Listing 13-3. Query to Return Metadata About the Stock Item Key Column
SELECT
tables.name AS table_name,
indexes.name AS index_name,
columns.name AS column_name,
partitions.partition_number,
column_store_segments.segment_id,
column_store_segments.min_data_id,
column_store_segments.max_data_id,
column_store_segments.row_count
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.indexes
ON indexes.index_id = partitions.index_id
AND indexes.object_id = partitions.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
AND column_store_segments.column_id = columns.column_id
WHERE tables.name = 'Sale_CCI_ORDERED'
AND columns.name = 'Stock Item Key'
ORDER BY tables.name, columns.name, column_store_segments.segment_id;
Figure 13-3 provides the output of this query, showing the minimum and maximum
values for Stock Item Key for each rowgroup.
Figure 13-3. Columnstore segment metadata for the Stock Item Key columns
It is readily apparent when viewing this segment metadata that each rowgroup has
similarly wide ranges of values for Stock Item Key, and therefore SQL Server is unable to
effectively use metadata to perform much filtering using this column. It is worth noting
that a query for a value greater than 219 would allow for some rowgroup elimination as
the max_data_id for some rowgroups is 219, rather than 227 or 225. Ignoring that small
detail, it is safe to say that most queries that filter on Stock Item Key will be forced to scan
all rowgroups in the columnstore index in order to return results.
If queries that filter on Stock Item Key are frequent and their current performance is
unacceptably slow, then one solution to this challenge is to implement a nonclustered
rowstore covering index on top of the columnstore index. To cover the query presented
here, it is necessary to order on Stock Item Key and include Quantity. The query in
Listing 13-4 creates this index.
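A sketch of this covering index, with an illustrative name:

CREATE NONCLUSTERED INDEX IX_Sale_CCI_ORDERED_Stock_Item_Key
ON Fact.Sale_CCI_ORDERED ([Stock Item Key])
INCLUDE (Quantity);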
The index creation query takes about 45 seconds to complete. Once complete,
executing the sample query in Listing 13-2 produces a new execution plan and new IO,
as seen in Figure 13-4.
Figure 13-4. Query execution plan and IO when using a covering index
SQL Server exclusively uses the covering index to execute the query, resulting
in 394 reads and an index seek against the new index. This IO is slightly higher than
the previous example, but that should not be the sole decision-making metric when
determining indexing strategy. The query executes exceptionally fast as well. There are a
few key benefits to using a nonclustered rowstore index:
• Query reads against the covering index do not create contention with
queries against the clustered columnstore index.
Enforcing Constraints
One common use of nonclustered rowstore indexes on clustered columnstore indexes
is not performance related. Nonclustered rowstore indexes may be used to enforce
uniqueness on a table, either via a primary key or by using a unique nonclustered index
definition.
Creating unique constraints is no different than on a clustered rowstore table.
Listing 13-5 shows the create statement for an existing nonclustered primary key over a
clustered columnstore index.
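A sketch of such a constraint follows; the name PK_Sale appears later in this chapter, while the key columns are assumptions that mirror the unique clustered index used on the view in Listing 13-9:

ALTER TABLE Fact.Sale
ADD CONSTRAINT PK_Sale
PRIMARY KEY NONCLUSTERED ([Sale Key], [Invoice Date Key]);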
This new index serves as a primary key for the underlying table, both enforcing
uniqueness and allowing for foreign keys to be created on other tables that reference
this table. For a scenario where either of these is an important architectural requirement,
creating a primary key on the clustered columnstore index is a good solution.
The table Fact.Sale contains many nonclustered rowstore indexes defined on it. The
query in Listing 13-6 returns the index size for each index on the table.
Listing 13-6. Query to Return Index Size for Each Index on Fact.Sale
SELECT
indexes.name AS Index_Name,
SUM(dm_db_partition_stats.used_page_count) * 8 Index_Size_KB,
SUM(dm_db_partition_stats.row_count) AS Row_Count
FROM sys.dm_db_partition_stats
INNER JOIN sys.indexes
ON dm_db_partition_stats.object_id = indexes.object_id
AND dm_db_partition_stats.index_id = indexes.index_id
INNER JOIN sys.tables
ON tables.object_id = dm_db_partition_stats.object_id
INNER JOIN sys.schemas
ON schemas.schema_id = tables.schema_id
WHERE schemas.name = 'Fact'
AND tables.name = 'Sale'
GROUP BY indexes.name
ORDER BY indexes.name
The results of the index size query are shown in Figure 13-5.
Oftentimes, OLTP queries against an analytic table may target only a portion of the
data – maybe newer, older, or based on some other useful filter. Consider the T-SQL that
was tested in Listing 13-2. This query filters on Stock Item Key, which is not the column
that then the columnstore index was ordered on. To mitigate the cost of this query, a
nonclustered rowstore index was added, as was shown in Listing 13-4. What if this query
never targeted new data, but exclusively was used to analyze older sales? Consider the
updated index in Listing 13-7.
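A sketch of the filtered version follows; the name is illustrative, while the filter comes from the surrounding text:

CREATE NONCLUSTERED INDEX IX_Sale_CCI_ORDERED_Stock_Item_Key_Filtered
ON Fact.Sale_CCI_ORDERED ([Stock Item Key])
INCLUDE (Quantity)
WHERE [Invoice Date Key] < '1/1/2016';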
When a query is executed that uses this index and filters for data prior to 1/1/2016,
it will use the filtered index rather than the columnstore index. Because the index is
smaller, the number of rows that are read and the IO will be less than for the unfiltered
version of this index.
One significant bonus of filtering a nonclustered rowstore index is that it allows
the index to target more specifically hot vs. warm vs. cold data. This provides multiple
benefits:
• Allows an index to target a specific use case, rather than all data
This can be time-consuming, especially for a larger table, but can allow columnstore
indexes to be periodically rebuilt to be smaller and more efficient than before.
Note that partitioning can be a valuable tool here as index maintenance can target
specific partitions. The query in Listing 13-8 shows how partitions can be filtered to show
only those that contain rowgroups lacking Vertipaq optimization.
Listing 13-8. Query to Return Rowgroups That Are Not Benefitting from Vertipaq Optimization

SELECT DISTINCT
objects.name,
partitions.partition_number,
dm_db_column_store_row_group_physical_stats.row_group_id,
dm_db_column_store_row_group_physical_stats.has_vertipaq_optimization
FROM sys.dm_db_column_store_row_group_physical_stats
INNER JOIN sys.objects
ON objects.object_id = dm_db_column_store_row_group_physical_stats.object_id
INNER JOIN sys.partitions
ON partitions.object_id = objects.object_id
AND partitions.partition_number = dm_db_column_store_row_group_physical_stats.partition_number
WHERE objects.name = 'Sale'
AND dm_db_column_store_row_group_physical_stats.has_vertipaq_optimization IS NOT NULL
AND dm_db_column_store_row_group_physical_stats.has_vertipaq_optimization = 0
ORDER BY dm_db_column_store_row_group_physical_stats.row_group_id;
The results show that only three partitions (3, 4, and 5) contain rowgroups that lack
Vertipaq optimization. Therefore, periodic index maintenance could be targeted at
only those partitions, skipping the rest. For less granular details, row_group_id can be
omitted from the query to reduce the result set to a list of partitions that does not contain
rowgroup IDs.
Indexed Views
It is possible to create indexed views on top of clustered columnstore indexes. While
this represents an added layer of complexity, it allows for queries that do not follow
the natural ordering of the columnstore index to be isolated into their own separate
data structure. There, the view and its indexes can be modified freely without directly
impacting the underlying table.
This can allow for more flexibility when performance testing supplementary indexes
on data that is stored in a clustered columnstore index. The script in Listing 13-9 rebuilds
the columnstore index on Fact.Sale, creates a schemabound view against it, and adds a
pair of indexes to the new view.
Listing 13-9. Script to Rebuild a Table and Add an Indexed View to a Clustered
Columnstore Index
ALTER INDEX [CCX_Fact_Sale] ON fact.sale REBUILD;
GO
CREATE VIEW Fact.v_Sale
WITH SCHEMABINDING
AS
SELECT
[Sale Key],
[City Key],
[Customer Key],
[Bill To Customer Key],
[Stock Item Key],
[Invoice Date Key],
[Delivery Date Key],
[Salesperson Key],
[WWI Invoice ID],
Description,
Package,
Quantity,
[Unit Price],
[Tax Rate],
[Total Excluding Tax],
[Tax Amount],
Profit,
[Total Including Tax],
[Total Dry Items],
[Total Chiller Items],
[Lineage Key]
FROM Fact.Sale;
GO
CREATE UNIQUE CLUSTERED INDEX CI_v_sale
ON Fact.v_Sale ([Sale Key], [Invoice Date Key]);
GO
CREATE NONCLUSTERED INDEX IX_v_Sale
ON Fact.v_Sale ([Stock Item Key], Quantity)
GO
Rebuilding the columnstore index ensures that the index is in pristine condition
prior to further testing. The view created in this demonstration includes all columns in
the table, but could be revised to include fewer columns, join other tables, add computed
columns, etc. The next step is to create a unique clustered index on the view. This step
is necessary as nonclustered indexes on a view require a unique index as a prerequisite.
Finally, a nonclustered covering index is added to handle queries against its columns.
Once complete, the analytic query in Listing 13-2 can be modified to target the view
instead, as shown in Listing 13-10.
SELECT
COUNT(*),
SUM(Quantity) AS total_quantity
FROM Fact.v_Sale
WHERE [Stock Item Key] = 186;
The results of the query are returned quickly. The execution plan and STATISTICS IO
output are provided in Figure 13-7.
Figure 13-7. Execution plan and STATISTICS IO output for a query against an
indexed view
The execution plan confirms that the nonclustered index on the view v_Sale
was used. The IO output shows acceptably low reads against the view. There is an
unexpected side effect of using an indexed view for managing select reads against a
columnstore index: Vertipaq optimization is still used when new rows are inserted into
the columnstore index!
While this is a useful benefit of indexed views against columnstore indexes, care
should still be exercised when adding complexity to an existing table. Balance the needs
of unusual workloads against a columnstore index vs. the cost of maintaining additional
views and the indexes against those views.
Always test queries without supporting views or indexes first. Confirm their
performance thoroughly and if needed test the impact of additional indexes or indexed
views to confirm whether the additions are worth the cost.
Consider the nonclustered primary key defined on Fact.Sale. The index size data in
Figure 13-5 shows that this index consumes 6,624KB. The script in Listing 13-11 drops
this constraint and replaces it with a new version that uses page compression.
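A sketch of that script follows; the constraint name PK_Sale matches the results shown in Listing 13-12, while the key columns are assumptions:

ALTER TABLE Fact.Sale DROP CONSTRAINT PK_Sale;
GO
ALTER TABLE Fact.Sale
ADD CONSTRAINT PK_Sale
PRIMARY KEY NONCLUSTERED ([Sale Key], [Invoice Date Key])
WITH (DATA_COMPRESSION = PAGE);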
When complete, the code in Listing 13-12 can be executed to confirm the new size of
the index, after compression is applied.
SELECT
indexes.name AS Index_Name,
SUM(dm_db_partition_stats.used_page_count) * 8 Index_Size_KB,
SUM(dm_db_partition_stats.row_count) AS Row_Count
FROM sys.dm_db_partition_stats
INNER JOIN sys.indexes
ON dm_db_partition_stats.object_id = indexes.object_id
AND dm_db_partition_stats.index_id = indexes.index_id
INNER JOIN sys.tables
ON tables.object_id = dm_db_partition_stats.object_id
INNER JOIN sys.schemas
ON schemas.schema_id = tables.schema_id
WHERE indexes.name = 'PK_Sale'
GROUP BY indexes.name;
Figure 13-8. Index size for a page compressed nonclustered primary key
The results show that the index now consumes only 2,736KB, a reduction of 3,888KB
or about 59%. This is a significant space savings and illustrates the positive impact
that compression can have when applied to secondary nonclustered indexes on a
columnstore indexed table.
CHAPTER 14
Columnstore Index Maintenance
Depending on its usage, a columnstore index may require no maintenance at all,
infrequent maintenance, or regular maintenance to ensure optimal storage, resource
consumption, and performance.
Understanding how and when to use index maintenance will ensure that analytic
queries continue to perform well both when a columnstore index is first created and
years into the future.
When data is deleted in a columnstore index, no rows are actually removed. The
cost to decompress, delete rows, and recompress rowgroups is prohibitively expensive at
runtime. Instead, SQL Server flags the rows as deleted using the delete bitmap, and when
queries access that data, they use the bitmap to identify any rows that are deleted and
need to be skipped.
Chapter 9 provides extensive detail as to how DELETE and UPDATE operations
perform on columnstore indexes and outlines ways of managing them to avoid
persistent performance challenges and minimize fragmentation. The net impact of
deletion on columnstore indexes will be
This columnstore index is pristinely ordered by Order Year, with each set of
rowgroups containing rows in ascending order by date. If a query is executed that
calculates metrics for 2016, it can read the first ten rowgroups while ignoring the
other 140.
Figure 14-2 shows the result of an UPDATE operation that alters data for 100,000
rows in 2016, 50,000 rows in 2017, and 25,000 rows in 2018.
When rows were updated, they remained in the columnstore index in their previous
locations, but were flagged as deleted. A new rowgroup was created that contains the
new versions of the deleted rows. The result is that there is now an open rowgroup that
contains data for 2016, 2017, and 2018. In addition, when new data is inserted into the
table, it will also be added to the open rowgroup, resulting in another year’s worth of
data being crammed into a single rowgroup.
Going forward, any query that requires data from 2016, 2017, 2018, or the current
year will need to read this unordered rowgroup. If updates like this are common, then
older rowgroups will quickly become logjammed with deleted rows, while newer
rowgroups become unordered messes of data from many dates. The result will be wasted
storage, wasted memory, and slower queries that need to read far more data than should
be needed to return results.
Unordered inserts will have an impact similar to that of the INSERT portion of an UPDATE
statement. Inserting data into the columnstore index in Figure 14-1 from 2015 would
result in new rowgroups that contain both new and old data. Unordered inserts will, over
a long period of time, result in the inability of SQL Server to take advantage of rowgroup
elimination as more and more rowgroups contain data from all different time periods.
Delta rowgroups are a key part of columnstore index architecture and ensure
that write operations can occur as quickly as possible. They slow down reads slightly,
though, as rows reside in a b-tree structure and are not compressed with columnstore
compression. The impact of the delta store on read performance is not significant,
but administrators interested in maximizing columnstore read performance would be
interested in compressing them as soon as possible after a data load process completes.
Listing 14-1. Query to Return Rowgroup Metadata, Including Deleted Row Counts

SELECT
tables.name AS table_name,
indexes.name AS index_name,
partitions.partition_number,
column_store_row_groups.row_group_id,
column_store_row_groups.total_rows,
column_store_row_groups.deleted_rows
FROM sys.column_store_row_groups
INNER JOIN sys.indexes
ON indexes.index_id = column_store_row_groups.index_id
AND indexes.object_id = column_store_row_groups.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.partitions
ON partitions.object_id = column_store_row_groups.object_id
AND partitions.index_id = indexes.index_id
AND partitions.partition_number = column_store_row_groups.partition_number;
Figure 14-3. Deleted row counts per rowgroup for a columnstore index
This detail shows how many rows are deleted per rowgroup. The results can
be aggregated to show deleted rows per partition or for the entire table. If a table is
partitioned, then knowing if the deleted rows exist only in one partition or all of them is
useful for determining if all or only some partitions require attention.
In the example results outlined in Figure 14-3, about 5% of the columnstore index is
comprised of deleted rows that are spread across the index relatively evenly. As a rough
guideline, there is no urgent need for index maintenance to address this unless the
percentage of deleted rows is at least 10% of the total rows in a partition or in the table.
Keep in mind that the performance impact of deleted rows is gradual over time.
There will never be a scenario where a threshold is reached in which performance
suddenly plummets. Therefore, automating index maintenance to occur when deleted
rows exceed some set percentage is an effective way to avoid accidentally allowing an
index to become absurdly fragmented.
Listing 14-2. Query to Retrieve Min/Max Data IDs for a Given Column in a
Columnstore Index
SELECT
tables.name AS table_name,
indexes.name AS index_name,
columns.name AS column_name,
partitions.partition_number,
column_store_segments.segment_id,
column_store_segments.min_data_id,
column_store_segments.max_data_id,
column_store_segments.row_count
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.indexes
ON indexes.index_id = partitions.index_id
AND indexes.object_id = partitions.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
AND column_store_segments.column_id = columns.column_id
WHERE tables.name = 'Sale_CCI'
AND columns.name = 'Invoice Date Key'
ORDER BY tables.name, columns.name, column_store_segments.segment_id;
The results of this query provide useful insight into the data order within this index,
as shown in Figure 14-4.
In the metadata, note that the values for min_data_id and max_data_id are the same
for each rowgroup. This means that queries that filter on the column Invoice Date Key
will be forced to scan the entire table to return results as any value could be found in any
rowgroup.
Consider an ordered version of this table, where the results are as seen in Figure 14-5.
This version of the table shows a progression of values for min_data_id and
max_data_id that increases as the segment_id increases. Because each segment contains
a distinct grouping of column values, this metadata can be effectively used to skip any
rowgroups that contain values that are irrelevant to a query. The query in Listing 14-3
returns a complete list of all segments in a columnstore index that have any overlapping
values. The inequalities are not inclusive as it is common that the start and end values in
a rowgroup will overlap those in the next rowgroup.
WITH CTE_SEGMENTS AS (
SELECT
tables.name AS table_name,
indexes.name AS index_name,
columns.name AS column_name,
partitions.partition_number,
column_store_segments.segment_id,
column_store_segments.min_data_id,
column_store_segments.max_data_id,
column_store_segments.row_count
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.indexes
ON indexes.index_id = partitions.index_id
AND indexes.object_id = partitions.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
AND column_store_segments.column_id = columns.column_id
WHERE tables.name = 'Sale_CCI_ORDERED'
AND columns.name = 'Invoice Date Key')
SELECT
CTE_SEGMENTS.table_name,
CTE_SEGMENTS.index_name,
CTE_SEGMENTS.column_name,
CTE_SEGMENTS.partition_number,
CTE_SEGMENTS.segment_id,
CTE_SEGMENTS.min_data_id,
CTE_SEGMENTS.max_data_id,
CTE_SEGMENTS.row_count,
OVERLAPPING_SEGMENT.partition_number AS overlapping_partition_number,
OVERLAPPING_SEGMENT.segment_id AS overlapping_segment_id,
OVERLAPPING_SEGMENT.min_data_id AS overlapping_min_data_id,
OVERLAPPING_SEGMENT.max_data_id AS overlapping_max_data_id
FROM CTE_SEGMENTS
INNER JOIN CTE_SEGMENTS OVERLAPPING_SEGMENT
ON (OVERLAPPING_SEGMENT.min_data_id > CTE_SEGMENTS.min_data_id
AND OVERLAPPING_SEGMENT.min_data_id < CTE_SEGMENTS.max_data_id)
OR (OVERLAPPING_SEGMENT.max_data_id > CTE_SEGMENTS.min_data_id
AND OVERLAPPING_SEGMENT.max_data_id < CTE_SEGMENTS.max_data_id)
OR (OVERLAPPING_SEGMENT.min_data_id < CTE_SEGMENTS.min_data_id
AND OVERLAPPING_SEGMENT.max_data_id > CTE_SEGMENTS.max_data_id)
ORDER BY CTE_SEGMENTS.partition_number, CTE_SEGMENTS.segment_id
This query evaluates the boundaries for the minimum and maximum value of one
column within each rowgroup and determines if any other rowgroups overlap those values.
The list returned by the query, shown in Figure 14-6, may appear long at first glance, but it
is important to note that any columnstore index that has been the target of UPDATE
operations or unordered inserts will have entries here. Looking at the data returned, it
can be seen that the out-of-order data is spread somewhat evenly across segments, with
each segment containing 1–4 other segments that overlap at least one value with it.
Figure 14-6. List of overlapping values within rowgroups for the Invoice Date
Key column
While there is no precise way to measure the percentage of unordered data in the
same way that it was possible to measure the percentage of rows in a columnstore
index that are deleted, it is possible to gauge how effectively data order impacts query
performance by performing metadata tests using COUNT(*) queries against the
columnstore index. This could be done for every date in the table, which would result in
a very thorough experiment. For the sake of a simple demonstration, eight sample dates
will be chosen at random to test, as given by the query in Listing 14-4.
Listing 14-4. Sample Dates to Test How Ordered Data Is in a Columnstore Index
SELECT
[Invoice Date Key],
COUNT(*) AS Sale_Count
FROM Fact.Sale_CCI_ORDERED
WHERE [Invoice Date Key] IN ('5/1/2013', '9/5/2013', '1/17/2014',
'6/30/2014', '3/14/2015', '12/12/2015', '1/1/2016', '2/29/2016')
GROUP BY [Invoice Date Key]
ORDER BY [Invoice Date Key]
The results in Figure 14-7 provide row counts for each date.
Figure 14-7. List of sample dates for use in testing the effectiveness of columnstore
data order
Each of the eight dates chosen contains at most 0.1% of the data in the table, which
holds 25,109,150 total rows. Based on the row counts and table size, an ideal ordered
table would only require reading 1–2 rowgroups
to retrieve data for any of those given dates. The query in Listing 14-5 executes separate
COUNT(*) queries for each date identified earlier.
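A sketch of those queries, one per sample date:

SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '5/1/2013';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '9/5/2013';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '1/17/2014';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '6/30/2014';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '3/14/2015';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '12/12/2015';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '1/1/2016';
SELECT COUNT(*) FROM Fact.Sale_CCI_ORDERED WHERE [Invoice Date Key] = '2/29/2016';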
Figure 14-8 shows the STATISTICS IO output for each preceding query.
For each of the eight COUNT(*) sample queries, 3–4 rowgroups are read in order
to retrieve the count. Based on the knowledge that these queries should read 1–2
rowgroups, it can be deduced that queries are generally reading two to three times more
rowgroups than is necessary. Depending on how often updates and unordered inserts
occur on the table, this may be acceptable or it may be unusual. Knowledge of the table’s
usage helps in understanding how extreme these numbers are and if reading three
rowgroups instead of one is normal or worthy of attention.
Testing rowgroups read and skipped using STATISTICS IO is an effective way to
gauge how ordered a columnstore index is. If there is a desire to be complete about this
test and use count queries against many (or all) dates in the table, consider treating them
as maintenance and performing that research at a predetermined time when running a
lengthy maintenance script is acceptable.
If those conditions can be met (no deletes, no updates, and no unordered inserts), then
a columnstore index can be spared of nearly all maintenance. Whether any is performed
is up to the whim of its administrator.
Realistically, the only suboptimal scenario that can arise when the columnstore index
is not the target of deletes, updates, or unordered inserts is that rowgroups may be
undersized due to the use of the delta store to process small INSERT operations.
The impact of undersized rowgroups resulting from delta store usage is minor and
should not be seen as an urgent problem in need of an immediate resolution. In these
scenarios, waiting for infrequent maintenance periods to perform columnstore index
maintenance would be more than effective enough to ensure that undersized rowgroups
are merged effectively. Quarterly or yearly is likely often enough.
Now that the causes of fragmentation have been thoroughly detailed, using index
maintenance operations to resolve fragmentation can be discussed.
Columnstore Reorganize
The fastest and simplest operation available for a columnstore index is the
REORGANIZE. This is an online operation that accomplishes the following tasks:
• Combining undersized rowgroups into fewer, fuller rowgroups (a merge)
• Removing deleted rows from rowgroups in which at least 102,400 rows are flagged as deleted (a self-merge)
If both of these tasks apply to a set of rowgroups, then a merge operation will be
prioritized over a self-merge. Rowgroups that were trimmed due to dictionary pressure
cannot be combined with other rowgroups, regardless of their row counts.
To demonstrate the merge and self-merge operations that can be used by a
columnstore REORGANIZE operation, a large set of rows will be deleted from a
columnstore index, as seen in Listing 14-6.
DELETE
FROM Fact.Sale_CCI_ORDERED
WHERE [Invoice Date Key] <= '1/17/2013';
Figure 14-9 shows the resulting set of deleted rows within the columnstore index’s
rowgroup metadata using the same query as provided in Listing 14-1.
The second column from the right shows total rows in the rowgroup, whereas the
rightmost column provides the deleted row count. Of the rowgroups with deleted rows,
only rowgroups 23 and 24 have more than 102,400 rows deleted and would qualify for a
self-merge operation. These rowgroups are also valid targets for a columnstore merge
operation as they can be combined, with the resulting rowgroup containing less than the
row cap (2^20 rows) for a columnstore rowgroup.
The syntax for a columnstore REORGANIZE operation is shown in Listing 14-7.
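In its simplest form (the index name here is an assumption):

ALTER INDEX CCI_Sale_CCI_ORDERED ON Fact.Sale_CCI_ORDERED REORGANIZE;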
Note that rowgroups 23 and 24 are now flagged as TOMBSTONE and will be cleaned
up by the tuple mover at some point in the near future. Two new rowgroups were created
(25 and 26) that replace them, with the deleted rows removed. The self-merge operation
essentially creates new rowgroups, copies all nondeleted rows into them, and swaps
them in as the active rowgroups while the previous versions are flagged for cleanup.
The resulting rowgroups are free of the burden of deleted rows. Remember that the
self-merge only occurs when more than 102,400 rows in a rowgroup are deleted.
After a minute passes, the rowgroup metadata confirms that the rowgroups labeled
as TOMBSTONE are now removed from the columnstore index, as seen in Figure 14-11.
This is an automatic cleanup process that requires no operator intervention to trigger.
Figure 14-11. Rowgroup metadata after the tuple mover removes TOMBSTONE
rowgroups
Rowgroups can only be combined if the reason they are undersized is not related to
dictionary pressure. Figure 14-12 shows additional metadata from sys.dm_db_column_
store_row_group_physical_stats for this columnstore index.
The upside of this operation is that it ensures faster read operations, as there is no
need to read the rowstore structure of the delta store when processing queries. The
downside is that it is an additional maintenance option that requires time and resources
to execute. Consider the INSERT operation shown in Listing 14-8.
One row is inserted into the columnstore index. Rowgroup metadata can confirm the
new row that resides in an open delta rowgroup, as seen in Figure 14-13, using the same
query as in Listing 14-1.
Figure 14-13. Rowgroup metadata for a newly inserted row into a delta rowgroup
With a single row in the delta store, a REORGANIZE operation will be run using the
COMPRESS_ALL_ROW_GROUPS option, as seen in Listing 14-9.
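A sketch of that operation, reusing the assumed index name:

ALTER INDEX CCI_Sale_CCI_ORDERED ON Fact.Sale_CCI_ORDERED REORGANIZE
WITH (COMPRESS_ALL_ROW_GROUPS = ON);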
The metadata shows a single row in a compressed rowgroup, and the old delta store
set to TOMBSTONE, awaiting cleanup. Figure 14-15 shows the rowgroup metadata after
the TOMBSTONE rowgroup is cleaned up.
Columnstore Rebuild
Rebuilding a columnstore index functions similarly to rebuilding a rowstore index.
When a REBUILD is issued, a completely new copy of the columnstore index is created,
replacing the old index. While an expensive process, the results are
• Rowgroups that are filled to capacity, whenever possible
• The removal of all deleted rows, along with the delete bitmap overhead
that accompanied them
Specifying the ONLINE option ensures that queries can continue to use the columnstore
index, even as a rebuild operation is running.
Rebuilding a clustered columnstore index can be accomplished as an online operation
only starting in SQL Server 2019. Nonclustered indexes can be rebuilt online regardless
of SQL Server version.
Consider the columnstore index that has been tested recently in this chapter. The
T-SQL in Listing 14-10 issues a REBUILD against that index.
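A sketch of that statement (index name assumed, as before):

ALTER INDEX CCI_Sale_CCI_ORDERED
ON Fact.Sale_CCI_ORDERED
REBUILD;

Adding WITH (ONLINE = ON) keeps the index available during the rebuild on SQL Server
2019 and later.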
After a REBUILD, the columnstore index has no deleted rows, and rowgroups
are mostly full. Oddly enough, the last two rowgroups (23 and 24) are undersized
residuals and can be cleaned up via an index REORGANIZE operation. The metadata
in Figure 14-18 shows the final results after the index is subject to an additional
REORGANIZE after the REBUILD.
Finally, the index is pristine, with 23 rowgroups that are completely full and 1
additional rowgroup that is left over from the index maintenance operations.
Index REBUILD operations should be used infrequently, to manage one of a few
scenarios:
• Extensive deleted rows that number less than 102,400 rows per
rowgroup and therefore cannot be addressed by REORGANIZE operations
• Undersized rowgroups that resulted from dictionary pressure, which
REORGANIZE cannot merge together
Note that when an index REBUILD is issued, the compression type may be changed
to or from columnstore archive compression, if needed. Because an index REBUILD
is an expensive operation and is offline prior to SQL Server 2019, care should be taken
to execute rebuilds during maintenance windows when such work is more tolerable to
processes that use this data.
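For example, a rebuild that switches the index to archive compression would take this
form (a sketch; index name assumed, as before):

ALTER INDEX CCI_Sale_CCI_ORDERED
ON Fact.Sale_CCI_ORDERED
REBUILD WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

Specifying DATA_COMPRESSION = COLUMNSTORE in the same fashion would return the index
to standard columnstore compression.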
Sometimes, data can become heavily fragmented via software releases that make
significant changes to the underlying schema and data. If this is anticipated, then
scheduling an index REBUILD as a postprocessing step after the software release would
be an excellent way to ensure that data continues to be efficiently delivered to analytic
processes, even after a disruptive software release.
REBUILD operations may target a specific partition, thus allowing only data that is
heavily fragmented to be rebuilt. For a large columnstore index in which only a small
portion is actively written to, this is an excellent way to speed up rebuilds and minimize
disruptions to the users of analytic data.
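A sketch of a partition-targeted rebuild (the table, index name, and partition number
are assumptions for illustration):

ALTER INDEX CCI_Sale_CCI_PARTITIONED
ON Fact.Sale_CCI_PARTITIONED
REBUILD PARTITION = 4;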
The clustered rowstore index is used to enforce a new data order on the contents of
the table, whereas the new columnstore index replaces it, retaining the new data order.
This is an expensive and disruptive process that will leave analytic queries unable
to take advantage of the columnstore index from the point when the index is dropped
until the new columnstore index is completely built. Therefore, it is a process worth
scheduling during a maintenance window, when disruption is more acceptable.
The script in Listing 14-11 performs these operations to reorder data from its current
state to data that is ordered perfectly by Invoice Date Key.
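One way to express those steps follows (a sketch; index names are assumptions, and
MAXDOP = 1 on the final step helps preserve the newly enforced order):

-- Step 1: Convert the clustered columnstore index into a clustered
-- rowstore index that physically orders the data by Invoice Date Key.
CREATE CLUSTERED INDEX CCI_Sale_CCI_ORDERED
ON Fact.Sale_CCI_ORDERED ([Invoice Date Key])
WITH (DROP_EXISTING = ON);

-- Step 2: Convert it back into a clustered columnstore index,
-- retaining the data order established in step 1.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sale_CCI_ORDERED
ON Fact.Sale_CCI_ORDERED
WITH (DROP_EXISTING = ON, MAXDOP = 1);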
Despite its complexity, the process to fix unordered data can be executed on a
partition-by-partition basis, thus allowing only active partitions containing warm/hot
data to be targeted with expensive index maintenance operations. For a columnstore
index with many partitions, this can save considerable time and reduce disruption and
downtime to analytic processes.
The primary difference here is that a REBUILD can eliminate out-of-order data
resulting from UPDATE operations. When rebuilt, a nonclustered columnstore index
will be created using the order prescribed by the clustered rowstore index, which is not
negatively impacted by UPDATE operations in the same way as a columnstore index.
Both REORGANIZE and REBUILD operations are available as online operations
for nonclustered indexes, providing more flexibility when trying to schedule recurring
(or one-time) maintenance. This means that real-time operational analytics that target
a nonclustered columnstore index can continue efficiently, even as that index is being
rebuilt.
Like with clustered columnstore indexes, maintenance on nonclustered columnstore
indexes can be issued against any or all partitions, allowing active data to be targeted
with maintenance while unchanging data can be skipped.
CHAPTER 15

Columnstore Index Performance
The ultimate measure of performance for any data structure is the speed with which
data can be retrieved. In columnstore indexes, the time required to return data will be a
function of two operations:
• Metadata reads
• Data reads
This chapter will walk through the steps needed to measure, assess, and tune
columnstore index query performance, both for analytics and write operations. It
will also introduce additional options for columnstore index usage and performance
optimization.
Listing 15-1. Analytic Query Against an Ordered, Partitioned Columnstore Index

SELECT
SUM(Quantity) AS Total_Quantity,
SUM([Total Excluding Tax]) AS Total_Excluding_Tax
FROM Fact.Sale_CCI_PARTITIONED
WHERE [Invoice Date Key] = '7/17/2015';
Two columns are aggregated for a single value of Invoice Date Key. This table is
both ordered and partitioned and therefore will benefit significantly from partition
elimination and rowgroup elimination. When the query is executed, columnstore
metadata is consulted to determine which segments need to be read.
The table Fact.Sale_CCI_PARTITIONED is partitioned on the Invoice Date Key
column by year, with partitions assigned to the years 2013, 2014, 2015, 2016, and 2017.
Listing 15-2 provides the definitions for both the partition scheme and function used in
this demonstration.
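The definitions take roughly the following form (a sketch; the function, scheme, and
most filegroup names are assumptions, modeled on the WideWorldImportersDW_2015_fg
filegroup referenced below):

CREATE PARTITION FUNCTION fnSaleYear (DATE)
AS RANGE RIGHT FOR VALUES ('1/1/2014', '1/1/2015', '1/1/2016', '1/1/2017');

CREATE PARTITION SCHEME schSaleYear
AS PARTITION fnSaleYear
TO (WideWorldImportersDW_2013_fg, WideWorldImportersDW_2014_fg,
    WideWorldImportersDW_2015_fg, WideWorldImportersDW_2016_fg,
    WideWorldImportersDW_2017_fg);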
The table contains data ranging from 2013 through 2016 and therefore will not make
use of the last partition. When executed, the query in Listing 15-1 will immediately
check the filter against the partition function and then use the partition scheme to
determine where the data is located. Based on this information, it is determined that
only the partition stored on the WideWorldImportersDW_2015_fg filegroup will be needed.
Columnstore metadata is stored separately for each partition, and therefore when
executed, this query needs to only consult metadata relevant to the partition containing
data for the year 2015. Figure 15-1 shows the segment metadata for Invoice Date Key in
this columnstore index.
Listing 15-3. Query to Return Dictionary Metadata for the Invoice Date Key Column

SELECT
partitions.partition_number,
objects.name,
columns.name,
column_store_dictionaries.type,
column_store_dictionaries.entry_count,
column_store_dictionaries.on_disk_size
FROM sys.column_store_dictionaries
INNER JOIN sys.partitions
ON column_store_dictionaries.hobt_id = partitions.hobt_id
INNER JOIN sys.objects
ON objects.object_id = partitions.object_id
INNER JOIN sys.columns
ON columns.column_id = column_store_dictionaries.column_id
AND columns.object_id = objects.object_id
WHERE objects.name = 'Sale_CCI_PARTITIONED'
AND columns.name = 'Invoice Date Key'
AND column_store_dictionaries.dictionary_id = 0;
The results provide additional detail about the dictionary used for this column, as
shown in Figure 15-2.
This shows that partitions 1–3 share a single dictionary that is a compact 1308 bytes.
The scenario presented here is essentially the optimal columnstore index and query: the
index is well ordered and partitioned, and the query aligns perfectly with that order to
minimize the amount of metadata read as part of its execution.
While columnstore metadata may not seem large when compared to the data itself,
it is important to remember that a columnstore index with one billion rows will have
approximately one thousand rowgroups. Each of those rowgroups will contain one
segment per column. For this table, which contains 21 columns, metadata will consist
of about 21,000 segments per billion rows. Needing to read metadata on rowgroups,
segments, dictionaries, and more can add up to a nontrivial amount of work for SQL
Server. As with data itself, metadata needs to be loaded into memory before it can be
read. Therefore, maintaining an ordered table and partitioning (if needed) can ensure
that excessive metadata is not read when executing analytic queries.
Returning to the analytic query in Listing 15-1, it can be expected that it would
execute quickly, resulting in minimal metadata reads, as well as minimal data reads. The
STATISTICS IO output for the query is shown in Figure 15-3.
While the entire columnstore index resides on a storage system, typically only
a fraction of it will be maintained in the buffer pool. Segments remain compressed
throughout this entire IO process until they are needed by a query, at which time they
are decompressed, and data is returned to SQL Server.
Memory Sizing
All of the performance discussions thus far have focused on reducing IO. Despite all
efforts to minimize unnecessary IO, larger columnstore indexes will ultimately require
reading significant numbers of segments from storage into memory.
The next step in ensuring highly performant columnstore indexes is to accurately
estimate the amount of memory that should be allocated to SQL Server to support
analytic queries.
With too little memory, data will be constantly removed from memory, only to be read
back into the buffer pool when it is needed again soon after. The time spent reading
data from storage (even fast storage) will far exceed that of the other steps in
executing analytic queries.
Too much memory would represent unused computing resources, which translate
into wasted money.
4. Approximate data growth over time that will adjust the numbers
calculated earlier.
Any data that is not hot or warm is expected to be cold and rarely used. It may still
be valuable but is not accessed often enough to architect an entire system around it.
Resources can be allocated at runtime by Azure, AWS, or other hosting services, if that
level of flexibility is desired, but that is entirely optional and depends on how
important speedy access to cold data is to the organization.
Consider a hypothetical columnstore index as shown in Listing 15-4.
Listing 15-4. Row Counts and Size for Data in a Columnstore Index
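A sketch of the figures the listing describes (row counts omitted):

Year(s)        Classification    Size
2021           Hot               18GB
2020           Warm              10GB
2017–2019      Cold               9GB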
This index shows a common trend in analytics by which the most recent year is
heavily used for current reporting. The previous year is also accessed frequently for year-
over-year analytics. All previous data is important and must be retained, but is not read
very often when compared to newer data.
Based on the numbers provided, if the goal was to ensure that the most frequently
used data was accounted for in memory allocation, then this columnstore index would
warrant 18GB of memory to ensure that all 18GB of hot data can reside in memory,
if needed.
If the 10GB of warm data were also important, then allocating some amount of
memory up to 10GB would help ensure that data is not cycled in and out of memory too
often and that it does not replace more important hot data in memory when requested.
If resources were plentiful, then allocating the full 10GB would accomplish the task in its
entirety. Otherwise, the organization responsible for this data would need to determine
how important the warm data is and allocate some amount of additional memory up
to 10GB to cover it. For example, if it was estimated that the latter half of the 2020 data
would be requested on a somewhat regular basis, but the earlier half would be far
less used, then 5GB would be a good estimate of memory to allocate to this block of
warm data.
The remaining 9GB of data from 2017 to 2019 would not receive any memory
allocation as they are rarely read and would impact performance too infrequently to
matter to most organizations. If infrequent analytics or reports are disruptive enough,
an argument could be made for adding further memory to reduce that disruption, but this
should be considered on a case-by-case basis.
This example also illustrates a fairly straightforward growth of about two times per
year. If this rate of growth is expected to continue and the years with warm and hot data
are expected to roll forward each year, then in a year, the current hot data (18GB) would
become warm data, the current warm data (10GB) would become cold data, and next
year’s data (~35GB) would become the new hot data. Therefore, year-over-year growth
in the memory estimate can be represented by the arithmetic sketched below.
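Assuming the allocation policy used earlier (all hot data plus roughly half of the
warm data):

Current estimate:   18GB (hot) + 5GB (half of 10GB warm)  = 23GB
Next year estimate: 35GB (hot) + 9GB (half of 18GB warm)  = 44GB

In other words, when the data roughly doubles each year, the memory allocation needed
to keep hot and frequently used warm data resident roughly doubles as well.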
Note that the size of columnstore data used in memory will often not be the same
as the total data size. A columnstore index with 20 columns may not see all columns
used equally. It is quite possible that some columns may rarely get used. If this is the
case, then memory estimations can adjust the amount needed downward to account
for columns that are not needed in memory often. While these infrequently queried
columns cannot be classified as cold data, they can be discounted from memory totals to
ensure that memory estimates are not overinflated by segments that are rarely needed.
For example, if the 18GB of data for 2021 included 5GB of columns that are rarely
queried, then the memory estimate for that data could be reduced from 18GB to as little
as 13GB, if there is confidence in how infrequently those columns are used.
There is a great deal of estimation that goes into this work. At the absolute high end,
a memory budget could encompass all space consumed by a columnstore index. At the
lowest level, no memory could be allocated for these analytics. A realistic estimation will
be somewhere in the middle and will be influenced by the performance required of this
data and the availability/cost of computing resources.
The most common scenarios involving dictionary pressure are derived from columns that
are both wide and have many distinct values. There is no one-size-fits-all solution to
the challenge of dictionary pressure, but three common approaches are the simplest ways
of resolving it:
Partitioning
One key element of partitioned columnstore indexes is that each partition contains its
own distinct columnstore index. For example, a table that is broken into ten partitions
will contain a separate columnstore structure in each partition. That includes separate
dictionaries, as well!
When columnstore index row counts become large (hundreds of millions or billions
of rows), partitioning can provide distinct performance and maintenance benefits in
SQL Server. In addition to partition elimination, an added performance improvement
can be seen in the separation of dictionaries across partitions. This solution is especially
effective when the values for wide columns change slowly over time. In these scenarios,
the data from one day to the next will have many repeated values, but across months or
years will not.
When partitioning a table, be sensitive to the size of the partitions that are created.
Because each represents a distinct, separate columnstore index, it is important that each
be large enough to benefit from the key properties of columnstore indexes. Therefore,
ensure that each partition has at least tens of millions of rows. Similarly, partitions
that are exceptionally large (billions of rows) may suffer the same problems as an
unpartitioned table with regard to dictionary pressure.
Testing is important in confirming the benefits of partitioning in a columnstore
index. Generally speaking, if a table contains hundreds of millions of rows (or more),
being able to subdivide it into smaller chunks using the columnstore ordering column
can provide meaningful performance and manageability benefits.
Listing 15-5. Query That Creates a Temporary Table, Populates It, and Adds a
Columnstore Index
CREATE TABLE #Sales_Temp_Data
( [Sale Key] BIGINT NOT NULL,
[Customer Key] INT NOT NULL,
[Invoice Date Key] DATE NOT NULL,
Quantity INT NOT NULL,
[Total Excluding Tax] DECIMAL(18,2) NOT NULL);
INSERT INTO #Sales_Temp_Data
SELECT
    [Sale Key], [Customer Key], [Invoice Date Key], Quantity,
    [Total Excluding Tax]
FROM Fact.Sale_CCI_PARTITIONED;

-- Index name assumed for illustration.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sales_Temp_Data
ON #Sales_Temp_Data;
Once added, the newly indexed temporary table may be queried as effectively as a
columnstore indexed permanent table, as seen in Listing 15-6.
-- The unfiltered companion query from the pair (its exact original
-- form is assumed):
SELECT
    SUM(Quantity) * SUM([Total Excluding Tax])
FROM #Sales_Temp_Data;

SELECT
    SUM(Quantity) * SUM([Total Excluding Tax])
FROM #Sales_Temp_Data
WHERE [Invoice Date Key] >= '1/1/2015'
AND [Invoice Date Key] < '1/1/2016';
These queries execute relatively quickly, returning the requested results. Figure 15-6
shows the STATISTICS IO output from that pair of queries.
Figure 15-6. IO for queries against a temporary table with a columnstore index
Note that the filtered query read almost half of the segments in the table, despite
only reading about a quarter of its rows. When this temporary table was populated, no
data order was enforced. As a result, data was inserted in whatever order SQL Server
happened to read it from the source table, which was not optimal. Depending on how
extensively the temporary table will be used for analytics, taking the extra steps to
order it before loading may or may not be worth the time and resources needed to do so.
That value will lie in whether the analytic processes complete quickly enough and
whether they consume excessive computing resources along the way.
Nonclustered columnstore indexes may also be created on temporary tables, if there is
a need to slice data using both transactional and analytic methods. One benefit of
doing so is that the clustered rowstore index can enforce data order on the table, even
as further writes are made to it. Another benefit is that the column list for the
columnstore index can be customized. This allows a small subset of columns to be
subject to analytics when the remainder may be needed for other operations. Listing
15-7 shows the syntax for creating a nonclustered columnstore index on a temporary table.
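A sketch of that syntax (the index name and column list are assumptions for
illustration):

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales_Temp_Data
ON #Sales_Temp_Data ([Invoice Date Key], Quantity, [Total Excluding Tax]);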
The syntax for nonclustered columnstore indexes is identical for temporary tables
and permanent tables, and once created, they can be used in the same fashion.
Note that columnstore indexes are not allowed on table variables. The T-SQL in
Listing 15-8 shows an attempt to do so.
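A sketch of such an attempt, using the inline index syntax that table variables
otherwise support (names assumed):

DECLARE @Sales_Temp_Data TABLE
(   [Sale Key] BIGINT NOT NULL,
    [Invoice Date Key] DATE NOT NULL,
    Quantity INT NOT NULL,
    INDEX CCI_Sales_Temp_Data CLUSTERED COLUMNSTORE);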
This T-SQL won't even compile and will immediately generate the error shown in
Figure 15-7 when parsed.
Figure 15-7. Parsing error when trying to create a columnstore index on a table
variable
SQL Server provides the courtesy of raising an error during parsing, before the table
variable is created and populated.
Using columnstore indexes on temporary tables will not be a frequently applied use
case, but specific processes that rely heavily on crunching temporary data may be able
to use them effectively. Performance testing is the ultimate arbiter of whether this is
a helpful solution or one that does not provide enough value to be worthwhile.
With these basic configuration steps complete, we can experiment with a memory-
optimized table that contains a columnstore index. The script in Listing 15-10 shows the
creation of an example table.
This definition appears valid, but when executed, an error is returned, as shown in
Figure 15-9.
While this error message appears to provide a glimmer of hope that this table definition
would work if it is executed on a version of SQL Server later than SQL Server 2014, the result
is still failure. After some review, a table definition is crafted that accommodates these
limitations and allows a table to (finally) be created, as seen in Listing 15-12.
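A minimal sketch of such a definition (the column list is abbreviated; names and
lengths are assumptions, and (MAX) string columns are avoided due to the limitations
above):

CREATE TABLE Sales.Orders_MOLTP
(   OrderID INT NOT NULL PRIMARY KEY NONCLUSTERED,
    CustomerID INT NOT NULL,
    OrderDate DATE NOT NULL,
    Comments NVARCHAR(4000) NULL,
    INDEX CCI_Orders_MOLTP CLUSTERED COLUMNSTORE)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);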
Success! This table contains a handful of features that are not present on disk-
based tables:
• MEMORY_OPTIMIZED = ON.
• DURABILITY = SCHEMA_AND_DATA.
The durability setting determines if this table’s data can be recovered when the
server is restarted or if only the schema is persisted. SCHEMA_ONLY is significantly
faster as there is no need to persist data to disk storage. In addition, startup is faster
as no data needs to be loaded into the table for it to be available. SCHEMA_ONLY is
typically used for tables that contain temporary data such as session, ETL, or transient
information that is not needed again once processed. SCHEMA_ONLY does not
support columnstore indexes, though, and therefore is out of scope for any further
discussion here.
Note that the columnstore index is labeled as a clustered columnstore index, but it
is not truly a clustered index. The primary storage mechanism for a memory-optimized
table is always the set of rows in memory. The columnstore index is an additional
structure on top of the memory-optimized object that is also persisted to disk storage.
These caveats impose quite a bit of overhead on SQL Server, which must maintain a
columnstore index alongside a memory-optimized table.
Also worthy of highlighting is the fact that this memory-optimized clustered
columnstore index contains numerous wide columns that are not ideal for dictionary
encoding. CustomerPurchaseOrderNumber, Comments, DeliveryInstructions, and
InternalComments are string columns that are unlikely to be repetitive. Therefore,
they are not likely to compress well and may cause dictionary pressure, resulting
in undersized rowgroups and further reduction in columnstore efficiency. This is
not a deal-breaker, but is essential to understand when considering implementing
a memory-optimized columnstore index. Tables that are built primarily for OLTP
workloads will often contain text data that is essential for transactional processing, but
that may be suboptimal for analytics. One possibility that could resolve a situation like
this would be to split the table into two, with the string columns and supporting data
in one table and the numbers and metrics in the other. This would allow one to be
given a memory-optimized columnstore index and the other to remain with its
memory-optimized rowstore structure.
With table creation and population complete, the size of the original Sales.Orders
table can be compared to its memory-optimized counterpart. Note the index contents of
each table:
Sales.Orders:
• A one-column clustered primary key index
Sales.Orders_MOLTP:
• A one-column primary key index
• A clustered columnstore index
Note in Figure 15-11 the significant size difference between the disk-based table
(25MB) and the memory-optimized table (117MB). That is a hefty space penalty and
underscores the fact that mapping a memory-optimized structure to a columnstore
structure is even more complex an operation than mapping a nonclustered rowstore
index to a clustered columnstore index. Before continuing, one additional
memory-optimized table will be created, as shown in Listing 15-14.
This table is identical to the previously created memory-optimized table, except that
the columnstore index has been swapped out for a nonclustered index on OrderDate.
The new table’s total size is 20MB, which is about one-sixth of the size of the columnstore
table, representing a significant space savings.
Consider a test analytic query that is executed against all three test tables, as shown
in Listing 15-15.
SELECT
COUNT(*) AS OrderCount,
COUNT(DISTINCT(CustomerID)) AS DistinctCustomerCount
FROM Sales.Orders
WHERE OrderDate >= '1/1/2015'
AND OrderDate < '4/1/2015';
SELECT
COUNT(*) AS OrderCount,
COUNT(DISTINCT(CustomerID)) AS DistinctCustomerCount
FROM Sales.Orders_MOLTP_NO_CCI
WHERE OrderDate >= '1/1/2015'
AND OrderDate < '4/1/2015';
SELECT
COUNT(*) AS OrderCount,
COUNT(DISTINCT(CustomerID)) AS DistinctCustomerCount
FROM Sales.Orders_MOLTP
WHERE OrderDate >= '1/1/2015'
AND OrderDate < '4/1/2015';
Results are returned from each query. The execution plans can be seen in
Figure 15-12.
Figure 15-12. Execution plans for a test analytic query against multiple tables
Even if a table meets these criteria, be sure to test thoroughly and be able to
quantify that the performance benefits of the memory-optimized columnstore index
outweigh the limitations and drawbacks. This is a unique architectural feature that
will only be applicable to a small fraction of high-volume, highly concurrent OLTP
tables. More often than not, other alternatives will be more feasible, such as a
nonclustered columnstore index on a disk-based table (with a compression delay, if
needed) or splitting frequently updated transactional columns into a separate table
from the analytic ones.
Optimization Strategies
Generally speaking, optimization is a process that should begin with solutions that
address the most common use cases. Tweaking and implementing additional layers of
complexity should only occur when those general-purpose solutions prove insufficient.