DB2 - An Introduction To Materialized Query Tables
The definition of a materialized query table (MQT) is based upon the result of a query. MQTs can
significantly improve the performance of queries. This article introduces you to MQTs, summary
tables, and staging tables, and shows you, by way of working examples, how to get up and running
with materialized query tables.
A materialized query table (MQT) is a table whose definition is based upon the result of a query. The data that is
contained in an MQT is derived from one or more tables on which the materialized query table definition is
based. Summary tables (or automatic summary tables, ASTs), which are familiar to IBM® DB2® Universal
Database™ (UDB) for Linux, UNIX®, and Windows® (DB2 UDB) users, are considered to be a specialized type
of MQT. The fullselect that is part of the definition of a summary table contains a GROUP BY clause
summarizing data from the tables that are referenced in the fullselect.
You can think of an MQT as a kind of materialized view. Both views and MQTs are defined on the basis of a
query. The query on which a view is based is run whenever the view is referenced; however, an MQT actually
stores the query results as data, and you can work with the data that is in the MQT instead of the data that is in
the underlying tables.
Materialized query tables can significantly improve the performance of queries, especially complex queries. If the
optimizer determines that a query or part of a query could be resolved using an MQT, the query might be
rewritten to take advantage of the MQT.
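The difference between a view and an MQT can be sketched as follows. This is a minimal illustration, not one of the article's listings: it assumes the SALES table of the SAMPLE database with its REGION and SALES columns, the object names SALES_BY_REGION_V and SALES_BY_REGION_MQT are hypothetical, and the statements are assumed to be run from a script with a semicolon statement terminator (for example, db2 -tvf).

```sql
-- A view stores only the query text; the aggregation runs on every reference.
CREATE VIEW sales_by_region_v AS
  SELECT region, SUM(sales) AS total_sales
  FROM sales
  GROUP BY region;

-- An MQT stores the query result as rows (hypothetical name).
CREATE TABLE sales_by_region_mqt AS
  (SELECT region, SUM(sales) AS total_sales
   FROM sales
   GROUP BY region)
  DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate the MQT and take it out of check pending state.
REFRESH TABLE sales_by_region_mqt;

-- Allow the optimizer to consider REFRESH DEFERRED MQTs in this session.
SET CURRENT REFRESH AGE ANY;

-- This query names only the base table. The optimizer might rewrite it
-- to read SALES_BY_REGION_MQT instead, depending on its cost estimates.
SELECT region, SUM(sales) AS total_sales
FROM sales
GROUP BY region;
```

Whether the rewrite actually happens is the optimizer's decision; examining the access plan with the explain facility is one way to check.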
An MQT can be defined at table creation time as maintained by the system or maintained by the user. The
following sections introduce you to these two types of MQTs, as well as summary tables and staging tables. The
examples that follow require a connection to the SAMPLE database; if you don’t have the SAMPLE database
created on your system, you can create it by entering the db2sampl command from any command prompt.
Listing 1 shows an example of creating a REFRESH IMMEDIATE system-maintained MQT. The table, which is
named EMP, is based on the underlying tables EMPLOYEE and DEPARTMENT in the SAMPLE database.
Because REFRESH IMMEDIATE MQTs require that at least one unique key from each table referenced in the
query appear in the select list, we first define a unique constraint on the EMPNO column in the EMPLOYEE
table and on the DEPTNO column in the DEPARTMENT table. The DATA INITIALLY DEFERRED clause
simply means that data will not be inserted into the table as part of the CREATE TABLE statement. After being
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 1 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM
created, the MQT is in check pending state (see "Demystifying table and table space states" under Resources), and cannot be queried
until the SET INTEGRITY statement has been executed against it. The IMMEDIATE CHECKED clause specifies
that the data is to be checked against the MQT's defining query and refreshed; the NOT INCREMENTAL clause
specifies that integrity checking is to be done on the whole table. A query executed against the EMP materialized
query table shows that it is now fully populated with data.
connect to sample
...
32 record(s) selected.
connect reset
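The steps described for Listing 1 can be sketched as the following script. It is a reconstruction from the description above, not the original listing: the select list is an assumption based on the standard EMPLOYEE and DEPARTMENT columns in the SAMPLE database, and a semicolon statement terminator is assumed. If your SAMPLE database already defines keys on EMPNO and DEPTNO, the two ALTER TABLE statements may be unnecessary.

```sql
CONNECT TO sample;

-- REFRESH IMMEDIATE needs a unique key from each referenced table
-- in the select list, so define the unique constraints first.
ALTER TABLE employee ADD UNIQUE (empno);
ALTER TABLE department ADD UNIQUE (deptno);

-- System-maintained MQT over EMPLOYEE and DEPARTMENT
-- (column choice is an assumption).
CREATE TABLE emp AS
  (SELECT e.empno, e.firstnme, e.lastname, e.phoneno,
          d.deptno, d.deptname, d.mgrno
   FROM employee e, department d
   WHERE e.workdept = d.deptno)
  DATA INITIALLY DEFERRED REFRESH IMMEDIATE;

-- Populate the MQT and take it out of check pending state.
SET INTEGRITY FOR emp IMMEDIATE CHECKED NOT INCREMENTAL;

SELECT * FROM emp;

CONNECT RESET;
```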
Listing 2 shows an example of creating a REFRESH DEFERRED user-maintained MQT. The table, which is
named ONTARIO_1995_SALES_TEAM, is based on the underlying tables EMPLOYEE and SALES in the
SAMPLE database. Again, the DATA INITIALLY DEFERRED clause means that data will not be inserted into
the table as part of the CREATE TABLE statement. After being created, the MQT is in check pending state (see
"Demystifying table and table space states" under Resources), and cannot be queried until the SET INTEGRITY statement has been
executed against it. The MATERIALIZED QUERY IMMEDIATE UNCHECKED clause specifies that the table
is to have integrity checking turned on, but is to be taken out of check pending state without being checked for
integrity violations.
Next, to populate the MQT with some data, we will import data that had been exported from the EMPLOYEE and
SALES tables. The exporting query matches the defining query for the MQT. Then we will insert another record
into the ONTARIO_1995_SALES_TEAM table.
A query executed against the ONTARIO_1995_SALES_TEAM materialized query table shows that it is now
fully populated with the imported and inserted data, demonstrating that user-maintained MQTs can indeed be
modified directly.
connect to sample
...
3 record(s) selected.
connect reset
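A sketch of the steps described for Listing 2 follows. It is a reconstruction under stated assumptions, not the original listing: the select list, the file name ontario_1995_sales_team.del, and the inserted row are hypothetical, and the year 1995 assumes the sales dates shipped with the Version 8 SAMPLE database. EXPORT and IMPORT are command line processor commands rather than SQL statements, so the script is assumed to run in the CLP with a semicolon terminator.

```sql
CONNECT TO sample;

-- User-maintained MQT (select list is an assumption).
CREATE TABLE ontario_1995_sales_team AS
  (SELECT DISTINCT e.empno, e.firstnme, e.lastname, e.workdept, e.phoneno,
          'Ontario' AS region, YEAR(s.sales_date) AS sales_year
   FROM employee e, sales s
   WHERE e.lastname = s.sales_person
     AND YEAR(s.sales_date) = 1995
     AND s.region LIKE 'Ontario%')
  DATA INITIALLY DEFERRED REFRESH DEFERRED MAINTAINED BY USER;

-- Take the MQT out of check pending state without checking it.
SET INTEGRITY FOR ontario_1995_sales_team MATERIALIZED QUERY IMMEDIATE UNCHECKED;

-- Export with a query that matches the MQT's defining query.
EXPORT TO ontario_1995_sales_team.del OF DEL
  SELECT DISTINCT e.empno, e.firstnme, e.lastname, e.workdept, e.phoneno,
         'Ontario' AS region, YEAR(s.sales_date) AS sales_year
  FROM employee e, sales s
  WHERE e.lastname = s.sales_person
    AND YEAR(s.sales_date) = 1995
    AND s.region LIKE 'Ontario%';

-- Load the exported rows into the MQT, then add one row directly
-- (hypothetical values).
IMPORT FROM ontario_1995_sales_team.del OF DEL
  INSERT INTO ontario_1995_sales_team;

INSERT INTO ontario_1995_sales_team
  VALUES ('006900', 'RUSS', 'DYERS', 'D11', '1234', 'Ontario', 1995);

SELECT * FROM ontario_1995_sales_team;

CONNECT RESET;
```

Because the table is maintained by the user, nothing keeps it consistent with EMPLOYEE and SALES after this point; that responsibility stays with whoever loads it.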
Summary tables
You will recall that a summary table is a specialized type of MQT whose fullselect contains a GROUP BY clause
summarizing data from the tables that are referenced in the fullselect. Listing 3 shows a simple example of
creating a summary table. The table, which is named SALES_SUMMARY, is based on the underlying table
SALES in the SAMPLE database. Once again, the DATA INITIALLY DEFERRED clause means that data will
not be inserted into the table as part of the CREATE TABLE statement. The REFRESH DEFERRED clause
means that the data in the table can be refreshed at any time using the REFRESH TABLE statement. A query
issued against this MQT after it is created, but before the REFRESH TABLE statement runs, returns an
error because the table is still in check pending state. After the REFRESH TABLE statement executes, the query runs successfully.
A subsequent insert operation into the SALES table, followed by a summary table refresh and a query against the
summary table, shows that the change to the underlying table is reflected in the summary table: salesperson Lee's
total sales in the Ontario-South region have increased by 100. Similar behavior can be observed in response to
update or delete operations against the underlying SALES table.
connect to sample
...
11 record(s) selected.
11 record(s) selected.
11 record(s) selected.
11 record(s) selected.
connect reset
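The sequence described for Listing 3 can be sketched as follows. This is a reconstruction rather than the original listing: the defining query is an assumption based on the SALES table's SALES_PERSON, REGION, and SALES columns, the inserted date is hypothetical, and a semicolon statement terminator is assumed.

```sql
CONNECT TO sample;

-- Summary table: an MQT whose fullselect contains a GROUP BY clause.
CREATE TABLE sales_summary AS
  (SELECT sales_person, region, SUM(sales) AS total_sales
   FROM sales
   GROUP BY sales_person, region)
  DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Expected to fail: the table has not been refreshed yet.
SELECT * FROM sales_summary;

-- Populate the summary table, then query it again.
REFRESH TABLE sales_summary;
SELECT * FROM sales_summary;

-- Add 100 to Lee's sales in Ontario-South (the date is hypothetical).
INSERT INTO sales (sales_date, sales_person, region, sales)
  VALUES ('2005-06-28', 'LEE', 'Ontario-South', 100);

-- The change becomes visible in the summary table only after a refresh.
REFRESH TABLE sales_summary;
SELECT * FROM sales_summary;

CONNECT RESET;
```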
Staging tables
You can incrementally refresh a REFRESH DEFERRED MQT if it has a staging table associated with it. The
staging table collects changes that need to be applied to synchronize the MQT with its underlying tables. You can
create a staging table using the CREATE TABLE statement; then, when the underlying tables of the MQT are
modified, the changes are propagated and immediately appended to the staging table. The idea is to use the
staging table to incrementally refresh the MQT, rather than regenerate the MQT from scratch. Because only the
staged changes are applied, incremental maintenance can be significantly faster than a full refresh. The staging
table is pruned when the refresh operation is complete.
After it is created, a staging table is in a pending (inconsistent) state; it must be brought out of this state before it
can start collecting changes to its underlying tables. You can accomplish this by using the SET INTEGRITY
statement.
Listing 4 shows an example of using a staging table with a summary table. The summary table, which is named
EMP_SUMMARY, is based on the underlying table EMPLOYEE in the SAMPLE database. You'll recall that the
DATA INITIALLY DEFERRED clause means that data will not be inserted into the table as part of the CREATE
TABLE statement. The REFRESH DEFERRED clause means that the data in the table can be refreshed at any
time using the REFRESH TABLE statement. The staging table, which is named EMP_SUMMARY_S, is
associated with the summary table EMP_SUMMARY. The PROPAGATE IMMEDIATE clause specifies that any
changes made to the underlying table as part of an insert, update, or delete operation are cascaded to the staging
table. SET INTEGRITY statements are issued against both tables to take them out of their pending states.
Not unexpectedly, a query against the summary table at this point returns no data. The REFRESH TABLE
statement returns a warning, a reminder that the "integrity of non-incremental data remains unverified." This, too,
is not unexpected. Another query against the summary table returns no data as well. However, after we insert a
new row of data into the underlying EMPLOYEE table, a query against the staging table EMP_SUMMARY_S
returns one row, corresponding to the data that was just inserted. The staging table has the same three columns
that its underlying summary table has, plus two additional columns that are used by the system:
GLOBALTRANSID (the global transaction ID for each propagated row) and GLOBALTRANSTIME (the
timestamp of the transaction). Another query against the summary table returns no data, but after the REFRESH
TABLE statement executes this time, the query runs successfully.
connect to sample
...
0 record(s) selected.
0 record(s) selected.
1 record(s) selected.
0 record(s) selected.
1 record(s) selected.
connect reset
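The sequence described for Listing 4 can be sketched as follows. It is a reconstruction under stated assumptions, not the original listing: the three-column defining query (WORKDEPT, JOB, and a count) and the inserted employee row are hypothetical, the column list in the INSERT assumes the standard EMPLOYEE layout, and a semicolon statement terminator is assumed.

```sql
CONNECT TO sample;

-- Summary table over EMPLOYEE (defining query is an assumption).
CREATE TABLE emp_summary AS
  (SELECT workdept, job, COUNT(*) AS emp_count
   FROM employee
   GROUP BY workdept, job)
  DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Staging table that collects changes for EMP_SUMMARY.
CREATE TABLE emp_summary_s FOR emp_summary PROPAGATE IMMEDIATE;

-- Take both tables out of their pending states.
SET INTEGRITY FOR emp_summary MATERIALIZED QUERY IMMEDIATE UNCHECKED;
SET INTEGRITY FOR emp_summary_s STAGING IMMEDIATE UNCHECKED;

-- No data yet; the refresh is expected to return a warning.
SELECT * FROM emp_summary;
REFRESH TABLE emp_summary;
SELECT * FROM emp_summary;

-- Change the underlying table (hypothetical employee).
INSERT INTO employee (empno, firstnme, midinit, lastname, workdept, job, edlevel)
  VALUES ('006900', 'RUSS', 'L', 'DYERS', 'D11', 'FIELDREP', 5);

-- The change is staged, but not yet applied to the summary table.
SELECT * FROM emp_summary_s;
SELECT * FROM emp_summary;

-- An incremental refresh applies the staged change.
REFRESH TABLE emp_summary;
SELECT * FROM emp_summary;

CONNECT RESET;
```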
Summary
The SYSCAT.TABDEP system catalog view contains a row for every dependency that a materialized query table
has on some other object. You can query this view to obtain a dependency summary for the MQTs that we have
created (Listing 5). MQTs have a DTYPE value of 'S'. The TABNAME column lists the names of the MQTs, and
the BNAME column lists the names of the database objects on which the corresponding MQTs depend. The
BTYPE column identifies the object type: 'T' for table, 'I' for index, and 'F' for function instance.
Listing 5. Querying the SYSCAT.TABDEP system catalog view to see MQT dependencies on other
database objects
connect to sample
...
9 record(s) selected.
connect reset
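A query along the following lines would produce the kind of dependency summary that Listing 5 describes. It is a sketch, not the original listing: the SUBSTR calls only shorten the output columns, and the ordering is an arbitrary choice.

```sql
CONNECT TO sample;

-- One row per dependency of an MQT (DTYPE = 'S') on another object.
SELECT SUBSTR(tabname, 1, 24) AS tabname,
       dtype,
       SUBSTR(bname, 1, 24) AS bname,
       btype
FROM syscat.tabdep
WHERE dtype = 'S'
ORDER BY tabname, bname;

CONNECT RESET;
```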
We have seen that a materialized query table, whose definition is based upon the result of a query, can be thought
of as a kind of materialized view. MQTs are important because they can significantly decrease the response time
for complex queries. This article has introduced you to the basic concepts behind system-maintained and
user-maintained MQTs, as well as summary tables and staging tables, and has illustrated these concepts with
working examples that you can run yourself. To learn more about materialized query tables, or for more
detailed information about any of the topics covered in this article, see the DB2 Information Center.
Resources
DB2 Universal Database for Linux, UNIX and Windows Support is the ideal place to locate resources such
as the Version 8.2 Information Center and PDF product manuals.
For the latest DB2 information online, including more detailed information about materialized query tables,
visit the DB2 Information Center.
Learn about Demystifying table and table space states in DB2 UDB.
Refer to the IBM DB2 Universal Database SQL Reference, Volume 1 and IBM DB2 Universal Database
SQL Reference, Volume 2 for detailed SQL documentation.
Roman B. Melnyk, Ph.D., is a senior member of the DB2 Information Development team, specializing
in database administration, DB2 utilities, and SQL. During more than nine years at IBM, Roman has
written numerous DB2 books, articles, and other related materials. Roman coauthored DB2 Version 8:
The Official Guide (Prentice Hall Professional Technical Reference, 2003), DB2: The Complete
Reference (Osborne/McGraw-Hill, 2001), DB2 Fundamentals Certification for Dummies (Hungry
Minds, 2001), and DB2 for Dummies (IDG Books, 2000).