02 - Implementation Methods S2

The document discusses different methods for implementing queries in Db2 for i including assembling queries into graphs and the optimizer choosing access methods. It also covers various ways to access permanent database objects like tables and indexes including table scans, table probes, index scans and index probes. The document provides details on indexing technologies used in Db2 for i like radix indexes.

Db2 for i

Implementation Methods

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Query Graphs

• A set of methods is assembled into query "graphs"

[Figure: three example query graphs, labeled Query 1, Query 2 and Query 3]

Choice of Access Methods

• The optimizer chooses the best method based on cost in
terms of CPU and I/O
– Cost is always determined relative to the optimization goal

• The optimizer tries to read and process only the data that is
actually required for the query result set

• Every choice involves tradeoffs and sometimes none of the
available choices is ideal

• Not having the right indexes limits both the metadata
needed to make choices and the methods available to
access the data

Access Methods for Permanent Objects

Table and Index Access Methods

• There are only two “real” database objects that can be accessed
by the query engine(s):
– Tables
– Indexes

• With these two objects there are a limited number of ways to
access the data they contain:
– Table Scan
– Table Probe
– Index Scan
– Index Probe

• To understand the advantages and disadvantages of each
method we have to separate…
– Data Access
– Data Processing

Permanent Object Access Methods

Object Types Scan Probe

Table

Radix Index

Encoded Vector Index

EVI Symbol Table

Table Scan Access Method

Read the table from beginning to end while applying any selection
criteria from the query.
• Advantages:
– More efficient I/O because of asynchronous prefetch, read ahead,
and large blocking factors.

• Disadvantages:
– All rows in the table are touched and tested even if they're not
needed for the query results.

• When to expect them:
– Used when the optimizer expects to need a large number of rows
from the table
– Optimization Goal is All I/O

Table Scan Access Method

NOTE
• Also known as "sequential access"
• A simple INSERT with VALUES plan uses "sequential access"
but the table is not being scanned

INSERT INTO department (deptnbr, deptname)
VALUES (500, 'Accounting')

Table Scan Example

SELECT *
FROM employee
WHERE workdept = 'A01'
OR workdept = 'B01'

Scan of the EMPLOYEE table: need to access every row, and test to see if
it matches the selection.

WORKDEPT (in physical row order):
B01, B01, G01, A01, G01, E01, A01, *deleted*, B01, A01, F01, *deleted*,
B01, C01, A01, F01, A01
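The scan above can be sketched in a few lines of Python. This is only an illustration of "touch every row and test it", not how the Db2 for i engine is implemented; the `WORKDEPT` list and the `table_scan` function are hypothetical stand-ins built from the slide's data, with `None` marking a deleted row.

```python
# Hypothetical in-memory stand-in for the EMPLOYEE table's WORKDEPT column.
# None marks a deleted row slot, as shown on the slide.
WORKDEPT = [
    "B01", "B01", "G01", "A01", "G01", "E01", "A01", None, "B01",
    "A01", "F01", None, "B01", "C01", "A01", "F01", "A01",
]


def table_scan(column, wanted):
    """Visit every row slot in physical order and test it against the selection."""
    touched = 0
    matching_rrns = []
    for rrn, value in enumerate(column, start=1):
        touched += 1  # every slot is touched, even deleted or non-matching ones
        if value is not None and value in wanted:
            matching_rrns.append(rrn)
    return matching_rrns, touched


matching_rrns, touched = table_scan(WORKDEPT, {"A01", "B01"})
print(f"touched {touched} row slots, {len(matching_rrns)} matched")
```

All 17 row slots are touched even though only 9 of them satisfy the selection, which is the disadvantage the Table Scan slide describes.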

Table Scan Example - Graph

SELECT *
FROM employee
WHERE workdept = 'A01'
OR workdept = 'B01'

Table Probe Access Method

Reads a row from the table based upon a specific relative record
number (RRN) value derived from some other method. This results in
a random I/O operation against the table.
• Advantages:
– Smaller disk I/O operations can be used

• Potential disadvantages:
– May not be optimal if a large number of rows are selected resulting in
random disk I/O

• When to expect them:
– Used when data from the table is required for further processing
▪ Local selection on additional columns
▪ Projection of additional columns
– Optimization Goal is First I/O

Table Probe Example

SELECT *
FROM employee
WHERE workdept = 'A01'

• Produce a Relative Row Number (RRN) for 'A01'
• Perform a random probe into the table using the RRN value

Employee Table
RRN  WORKDEPT
001  B01
002  B01
003  G01
004  A01
005  G01
006  E01
007  A01
008  C01
009  D01
010  A01
011  C01
012  F01
013  B01
014  C01
015  D01
016  F01
017  A01
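A table probe can be pictured as a direct fetch by position. The sketch below assumes the RRNs for 'A01' (4, 7, 10 and 17 in the slide's table) have already been produced by some other method; `EMPLOYEE` and `table_probe` are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for the Employee table: list position + 1 is the RRN.
EMPLOYEE = [
    "B01", "B01", "G01", "A01", "G01", "E01", "A01", "C01", "D01",
    "A01", "C01", "F01", "B01", "C01", "D01", "F01", "A01",
]


def table_probe(table, rrn):
    """Fetch one row directly by its relative record number (1-based)."""
    return table[rrn - 1]


# RRNs assumed to come from another method, such as an index probe.
rrns_for_a01 = [4, 7, 10, 17]
rows = [table_probe(EMPLOYEE, rrn) for rrn in rrns_for_a01]
print(rows)
```

Only four rows are read, but each read lands at an unrelated position in the table, which corresponds to the random I/O noted on the Table Probe slide.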

Table Probe Example - Graph

SELECT *
FROM employee
WHERE workdept = 'A01'

Indexing Technology

Radix Index

• Key values are logically compressed


– Common patterns stored once
– Unique portion stored in "leaf" pages
– Positive impact on size and depth of the "tree"

• Algorithm used to find values


– Binary search
– Very efficient process to find a unique value or small range of values
– Index is re-organized automatically
– Used to materialize a bitmap or relative row number (RRN)

• Maintenance
– Index data is automatically spread across all available disk units
– Tree is automatically "rebalanced" to maintain an efficient structure
– No manual "index reorganization" required

Radix Index

[Figure: a DB table holding the key values ARKANSAS (RRN 001), MISSISSIPPI (002), MISSOURI (003), IOWA (004), ARIZONA (005), ... and the corresponding radix tree. Below the ROOT, test nodes hold the common prefixes "AR" and "MISS" (key compression); the leaves hold the unique portions with their RRNs: IZONA 005, KANSAS 001, IOWA 004, ISSIPPI 002, OURI 003]

• ADVANTAGES:
– Quick access to a single key value (million-entry index, on average, only
20 tests)
– Also efficient for small, selected range of key values

• DISADVANTAGES:
– Table rows retrieved in order of key values (not physical order) which
equates to many RANDOM I/Os when selecting a large number of keys
(high cardinality)
– No way to predict which physical index pages are next when traversing
the index for large number of key values

Encoded Vector Index (EVI)

• Elegant index design which delivers faster data access in
analytical query and reporting environments
– Advanced, patented “columnar DB” technology from IBM Research
– A more scalable variation of traditional bitmap indexing
– Complements the radix index
– Provides better metadata than the radix index
– Used to materialize a bitmap or relative row number (RRN) list
– Supports index ANDing / ORing
– Supports interesting index only access
– Easy to maintain

Encoded Vector Index (EVI)
• Object type is File, subtype is LF
• EVI is composed of two parts...

SYMBOL TABLE:
Key Value   Code  Count
Alabama     1     1000
Alaska      2     450
Arizona     3     5000
California  4     10000
Colorado    5     6500
...         ...   ...
Wisconsin   49    340
Wyoming     50    2760

VECTOR (only codes):
Row  Code
1    1
2    17
3    5
4    9
5    2
6    7
7    49
8    49
9    5
...  ...

• Symbol table contains information for each distinct key value
• Each distinct key value is assigned a unique code
– Code is 1, 2, or 4 bytes - depending on the actual number of distinct key values
• Rather than a bit array for each distinct key value, the index has
one array of codes (i.e. the Vector)
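The two-part structure can be sketched as follows. This is a simplified model, not the actual EVI build: it assumes codes are handed out in sorted key order, and `build_evi` with its dictionary layout is hypothetical.

```python
def build_evi(column):
    """Build a toy symbol table (key -> code and count) and a vector of codes."""
    symbol_table = {}
    # Assumption for this sketch: codes are assigned in sorted key order.
    for key in sorted(set(column)):
        symbol_table[key] = {"code": len(symbol_table) + 1, "count": 0}

    vector = []
    for value in column:
        entry = symbol_table[value]
        entry["count"] += 1            # per-key statistics live in the symbol table
        vector.append(entry["code"])   # one code per table row, in row order
    return symbol_table, vector


symbol_table, vector = build_evi(["Alaska", "Alabama", "Alaska", "Wyoming"])
print(symbol_table)
print(vector)
```

The vector has exactly one entry per row, while the symbol table has one entry per distinct key value, which is why the symbol table stays small for low-cardinality columns.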

Creating Indexes

• CREATE INDEX SQL statement

CREATE INDEX my_index ON my_table (key1, key2 DESC)

• CREATE ENCODED VECTOR INDEX SQL statement

CREATE ENCODED VECTOR INDEX my_evi ON my_table (key1)

• Access Client Solutions – index creation wizard

• CRTPF and CRTLF CL commands


– Keyed access paths within the physical file, logical file or join logical file

• Primary Key, Foreign Key and Unique Key Constraints


– CREATE TABLE
– ALTER TABLE
– ADDPFCST

Using Indexes

Depending on the object accessed, the “relative” order of the data will
appear differently.

Index (color)       Table
Key      RRN        RRN   COLOR
black    3          1     yellow
black    17         2     red
blue     7          3     black
blue     10         4     brown
blue     12         5     red
brown    4          6     red
green    8          7     blue
green    16         8     green
orange   9          9     orange
red      2          10    blue
red      5          11    violet
red      6          12    blue
red      14         13    white
violet   11         14    red
white    13         15    white
white    15         16    green
yellow   1          17    black
yellow   18         18    yellow
…        …          …     …

Using Indexes - Probe versus Scan

Index Key Columns (Column1,Column2,Column3)

• Probe (key positioning) with leading, n contiguous key columns

Selection on   Column1  Column2  Column3
1              Probe    -        -
1+2            Probe    Probe    -
1+2+3          Probe    Probe    Probe

• Scan (read and test) with any key columns

Selection on   Column1  Column2  Column3
2              -        Scan     -
3              -        -        Scan
2+3            -        Scan     Scan

Using Indexes - Probe versus Scan
Index Key Columns (CUSTOMER, ORDER, ITEM)

CUSTOMER  ORDER  ITEM
001       B507   AB-2700
001       B607   CD-2000
002       B100   XY-1005
002       B102   AZ-5000
003       B709   HH-6500
004       B043   HH-6500

• Probe (key positioning) with leading, n contiguous key columns: 1, 1+2, 1+2+3
• Scan (read and test) with any key columns: 2, 3, 2+3

Example selections:
WHERE ORDER = 'B102' AND CUSTOMER = 002
WHERE ITEM = 'HH-6500'
WHERE ITEM = 'CD-2000' AND CUSTOMER = 001
WHERE ITEM = 'HH-6500' OR COLOR = 'RED'
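The probe-versus-scan rule can be expressed as a small function. The sketch below is a deliberately simplified model that only handles equality predicates combined with AND; it does not model OR conditions or columns missing from the index, and `classify` is a hypothetical helper rather than optimizer logic.

```python
def classify(index_keys, predicate_columns):
    """Split predicate columns into those usable for probe and those left to scan.

    A key column can be probed only while every key column before it
    (the leading, contiguous ones) also has a predicate.
    """
    probe, scan = [], []
    still_leading = True
    for key in index_keys:
        if key in predicate_columns:
            (probe if still_leading else scan).append(key)
        else:
            still_leading = False  # a gap ends the probeable prefix
    return probe, scan


INDEX_KEYS = ["CUSTOMER", "ORDER", "ITEM"]
print(classify(INDEX_KEYS, {"ORDER", "CUSTOMER"}))  # first example selection
print(classify(INDEX_KEYS, {"ITEM"}))               # second example selection
print(classify(INDEX_KEYS, {"ITEM", "CUSTOMER"}))   # third example selection
```

For the third selection the model probes on CUSTOMER and then tests ITEM within the probed key range, which is the combined probe and scan case.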


Index Probe and Scan Method

Scan Here

Index Scan Access Method

Reads all key values from beginning to end while applying the selection
criteria to the data within the index.

• Advantages:
– Rows from the table are returned in the key order of the index
– Can be used in conjunction with an index probe

• Potential disadvantages:
– May result in random disk I/O against the table and/or the index

• When to expect them:
– Relatively few rows are needed from the index and the table
– The index matches the sequence needed for ordering or grouping
– The index has some useful keys for selection but they aren’t in the
leading, contiguous positions needed for the index to be probed
– Optimization Goal is First I/O

Index Scan Example

Given an index like this:
CREATE INDEX empix ON
employee (lastname, workdept)

SELECT * FROM employee
WHERE workdept = 'B01'
ORDER BY lastname

• Think of scanning the entire index, testing WORKDEPT for B01
• Keys and rows are accessed in LASTNAME order

EMPLOYEE Index (scanned)
LASTNAME  WORKDEPT
Adamson   B01
Anderson  B01
Anderson  G01
Cain      A01
Caine     G01
Doe       E01
Jones     B01
Jones     C01
Jones     D01
Milligan  A01
Peterson  B01
Peterson  F01
Smith     B01
Smith     C01
Smith     D01
Smith     F01
Wulf      B01

Index Scan Example - Graph

SELECT * FROM employee
WHERE workdept = 'B01'
ORDER BY lastname

Index Probe Access Method

Positions into a set of keys and reads them from the index.
• Also known as “key row positioning” in some high level languages

• Advantages:
– Return the rows back in the same order as the keys in the index
– Can be used in conjunction with index scan

• Potential disadvantages:
– May result in random disk I/O against the table and/or the index

• When to expect them:
– When relatively few rows are returned from the index and the table
– When the index matches the sequence needed for ordering or grouping
– The selection columns match leading and contiguous key columns of the
index
– Optimization Goal is First I/O

Index Probe Example

Given an index like this:
CREATE INDEX empinx2 ON employee (workdept)

SELECT *
FROM employee
WHERE workdept = 'B01'

• Perform a probe into the key range using the local selection value(s)
• Perform a probe into the table using the RRN value

EMPLOYEE Index
WORKDEPT  (RRN)
A01       (004)
A01       (007)
A01       (010)
A01       (017)
B01       (003)
B01       (013)
B01       (019)
C01       (008)
...

EMPLOYEE Table
RRN  WORKDEPT
001  D01
002  E01
003  B01
004  A01
005  G01
006  E01
007  A01
008  C01
009  D01
010  A01
011  C01
012  F01
013  B01
014  C01
015  D01
016  F01
017  A01
...
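The key-range probe can be approximated with Python's `bisect` module over a sorted key list. A sorted list is only a stand-in for the radix tree, and the `KEYS`/`RRNS` parallel lists and `index_probe` function are hypothetical names holding the index entries shown above.

```python
import bisect

# Hypothetical flattened copy of the EMPLOYEE index: keys in sorted order,
# with the RRN of each key held at the same position in a parallel list.
KEYS = ["A01", "A01", "A01", "A01", "B01", "B01", "B01", "C01"]
RRNS = [4, 7, 10, 17, 3, 13, 19, 8]


def index_probe(key):
    """Position directly to the range of equal keys and return their RRNs."""
    start = bisect.bisect_left(KEYS, key)   # first entry >= key
    end = bisect.bisect_right(KEYS, key)    # first entry > key
    return RRNS[start:end]


print(index_probe("B01"))
```

Only the entries for the requested key are read; the returned RRNs would then drive a table probe for each row, because the query selects columns that are not in the index.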

Index Probe Example - Graph

SELECT *
FROM employee
WHERE workdept = 'B01'

Index Probe + Scan Example

Given an index like this:
CREATE INDEX empinx3 ON employee
(workdept, firstname, lastname)

SELECT firstname, lastname, workdept
FROM employee
WHERE workdept = 'C01'
AND lastname IN ('JONES', 'SMITH')

Probe first key WORKDEPT, scan (test) third key LASTNAME and pull out all
three columns (index only access)

EMPLOYEE Index
WORKDEPT  FIRSTNAME  LASTNAME
A01       Adam       Jones
A01       Kent       Milligan
A01       Mark       Wulf
A01       Mike       Cain
B01       Amy        Adamson
B01       Amy        Anderson
B01       Mike       Smith
C01       Doug       Jones
C01       Eric       Peterson
C01       Jack       Smith
D01       Davey      Jones
D01       Jill       Smith
E01       John       Doe
F01       John       Smith
F01       Sally      Peterson
G01       Daniel     Anderson
G01       Michael    Caine
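Combining the two methods can be sketched as a probe on the leading key followed by a test of the remaining keys inside the probed range. The `INDEX` list and `probe_then_scan` function are hypothetical; the sketch matches last names using the mixed-case spelling shown in the index figure, which is an assumption because the query text uses upper case.

```python
import bisect

# Hypothetical copy of the EMPLOYEE index, sorted by (workdept, firstname, lastname).
INDEX = [
    ("A01", "Adam", "Jones"), ("A01", "Kent", "Milligan"), ("A01", "Mark", "Wulf"),
    ("A01", "Mike", "Cain"), ("B01", "Amy", "Adamson"), ("B01", "Amy", "Anderson"),
    ("B01", "Mike", "Smith"), ("C01", "Doug", "Jones"), ("C01", "Eric", "Peterson"),
    ("C01", "Jack", "Smith"), ("D01", "Davey", "Jones"), ("D01", "Jill", "Smith"),
    ("E01", "John", "Doe"), ("F01", "John", "Smith"), ("F01", "Sally", "Peterson"),
    ("G01", "Daniel", "Anderson"), ("G01", "Michael", "Caine"),
]
WORKDEPTS = [entry[0] for entry in INDEX]  # leading key column only


def probe_then_scan(workdept, lastnames):
    """Probe on the leading key, then test LASTNAME within the probed range."""
    start = bisect.bisect_left(WORKDEPTS, workdept)
    end = bisect.bisect_right(WORKDEPTS, workdept)
    # Every selected column is in the index entry, so no table probe is needed.
    return [entry for entry in INDEX[start:end] if entry[2] in lastnames]


print(probe_then_scan("C01", {"Jones", "Smith"}))
```

Only the three C01 entries are tested against the LASTNAME selection, instead of all seventeen index entries.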

Index Probe + Scan Example - Graph

SELECT firstname, lastname, workdept
FROM employee
WHERE workdept = 'C01'
AND lastname IN ('JONES', 'SMITH')

Index Scan (EVI) Example

Given an EVI like this:
CREATE ENCODED VECTOR INDEX empinx4 ON
employee (workdept)

SELECT *
FROM employee
WHERE workdept = 'B01'

• Binary search symbol table for key(s) and code(s)
• Scan vector for code(s)
• Return RRN list

Symbol Table
Key  Code
A00  1
A01  2
A04  3
B00  4
B01  5
B02  6
B05  7
C00  8
...  ...

Vector (codes in row order):
1, 17, 5, 9, 2, 7, 49, 49, 5, ...
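The three steps can be sketched with the slide's data. This sketch swaps the binary search of the symbol table for a plain dictionary lookup, and `SYMBOL_TABLE`, `VECTOR` and `evi_rrn_list` are hypothetical names holding only the entries visible in the figure.

```python
# Hypothetical copies of the visible parts of the symbol table and the vector.
SYMBOL_TABLE = {"A00": 1, "A01": 2, "A04": 3, "B00": 4,
                "B01": 5, "B02": 6, "B05": 7, "C00": 8}
VECTOR = [1, 17, 5, 9, 2, 7, 49, 49, 5]


def evi_rrn_list(key):
    """Look up the key's code, then scan the vector for rows carrying that code."""
    code = SYMBOL_TABLE.get(key)  # dictionary lookup instead of a binary search
    if code is None:
        return []                 # key not in the symbol table: no rows qualify
    return [rrn for rrn, row_code in enumerate(VECTOR, start=1) if row_code == code]


print(evi_rrn_list("B01"))
```

The vector is read sequentially from start to end, so the resulting RRN list comes out in physical row order.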

Index Scan (EVI) Example - Graph

SELECT *
FROM employee
WHERE workdept = 'B01'

Index Only Access

• Avoid or minimize I/O by reading the required data from the
index
– No need to access the underlying table
• All columns used in the query must be contained in the index
• Used with index probe or index scan
– Used frequently with radix indexes
– Used with EVI when the data is represented in symbol table
▪ Can help with COUNT, DISTINCT, and aggregate queries

Index Only Access Example - Graph

SELECT firstname, lastname, workdept
FROM employee
WHERE workdept = 'C01'
AND lastname IN ('JONES', 'SMITH')

The table probe is not required since all the columns are represented
within the index

Index Only Access with EVIs

CREATE ENCODED VECTOR INDEX sales_evi ON sales (returnflag)


:
SELECT DISTINCT returnflag FROM sales
:
SELECT COUNT(DISTINCT returnflag) FROM sales

Scan EVI symbol table, selecting distinct key values for RETURNFLAG,
using index only access.

EVI Symbol Table
KEY
A
N
R

Index Only Access with EVIs

CREATE ENCODED VECTOR INDEX big_table_evi ON
BIG_TABLE (quantity, returnflag, year)
:
SELECT returnflag, SUM(quantity)
FROM big_table
WHERE year = 2016
GROUP BY returnflag

Scan EVI symbol table, selecting data for QUANTITY, RETURNFLAG and YEAR
using index only access.

Index Only Access with EVIs

CREATE ENCODED VECTOR INDEX big_table_evi
ON big_table (returnflag, year)
INCLUDE COUNT(*), SUM(quantity)
:
SELECT returnflag, SUM(quantity)
FROM big_table
WHERE year = 2009
GROUP BY returnflag

Scan EVI symbol table, selecting data for SUM(QUANTITY), RETURNFLAG and
YEAR using index only access.
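The idea of answering the grouping query from the symbol table alone can be sketched as follows. The row data is invented for illustration, and `build_symbol_table` and `sum_by_returnflag` are hypothetical helpers that model a symbol table keyed by (returnflag, year) carrying the included COUNT(*) and SUM(quantity) values.

```python
from collections import defaultdict


def build_symbol_table(rows):
    """One entry per distinct (returnflag, year) key, carrying running aggregates."""
    table = {}
    for returnflag, year, quantity in rows:
        entry = table.setdefault((returnflag, year), {"count": 0, "sum_quantity": 0})
        entry["count"] += 1
        entry["sum_quantity"] += quantity
    return table


def sum_by_returnflag(symbol_table, year):
    """Answer the GROUP BY query from symbol table entries only."""
    totals = defaultdict(int)
    for (returnflag, key_year), entry in symbol_table.items():
        if key_year == year:
            totals[returnflag] += entry["sum_quantity"]
    return dict(totals)


# Made-up rows: (returnflag, year, quantity).
ROWS = [("A", 2009, 10), ("A", 2009, 5), ("N", 2009, 7), ("R", 2008, 3), ("N", 2010, 1)]
symbol_table = build_symbol_table(ROWS)
print(sum_by_returnflag(symbol_table, 2009))
```

The query touches one symbol table entry per distinct key rather than one entry per table row, which is where the saving comes from on a large table.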

LAB
Accessing Tables and Indexes

Temporary Data Structures

Temporary Data Structures
• The query engine can create, populate, and use a variety of temporary
structures.
• A temporary data structure may be used within the query for...

– DISTINCT when an index is not used


– Joining when an index is not used
– UNION or UNION ALL
– ORDER BY columns from more than one table
– GROUP BY columns from more than one table
– Grouping and ordering columns are different
– Complex view or logical file being queried
– Ordering when an index is not used
– UPDATEs with subSELECTs or subQueries
– FETCH FIRST n ROWS
– Poor indexing strategy (fewer alternatives)
– Better performance (less CPU and/or less I/O)

Temporary Data Structures

• The SQE optimizer tends to create and populate the data
structures upon the first reference
– Open time will tend to be shorter
– First fetch or first reference to the data structure will tend to be longer

• SQE may choose to reuse the temporary results
– SQE can retain both intermediate results and final results
– Only used if the query plan contains a temp data structure
– Only happens when the table has not changed and
parameter marker values are the same
– An IPL will clear the temp results
– QAQQINI option to control behavior
– SQE Plan Cache information reflects this behavior

Temporary Data Structures

How the data is populated into the data structure is completely
independent of how the data is consumed.

Many temporary data structures have to be fully populated
before they can be consumed!

[Figure: data flows from the base table(s) through a "Populate" *Query into the temp data structure, and from there through a "Consume" *Query to the final result]

Advantages and Disadvantages

Temporary data structures have the following advantages
and disadvantages
• Advantages
– Temporary data structures can be populated with the smallest subset of
data needed to implement the query
– They tend to be memory resident and can reduce I/O operations
– The optimizer can cache temporary results for later reuse

• Disadvantages
– Many have to be completely populated before they can be consumed
– Because they tend to be memory resident, they generally need more
memory resources
– Most of them are not maintained and thus can contain “stale” copies of
table and index data

Implementation Consequences

The inherent advantages and disadvantages of temporary
data structures have the following consequences for query
implementation:

– Because many of them have a higher startup cost, they are more
likely to be used when the optimization goal is *ALLIO

– Because many of them generally have a larger memory footprint,
they are more likely to be used when there is a sufficient fair
share of memory and when they only need to contain a subset of
the data

– The decision to use a temporary data structure is always based
on cost. They are used when the cost of using them is lower
than the other implementation alternatives.

Temporary Data Structure Access Methods

Object Types Scan Probe

Temporary Radix Index

Hash Table

Sorted List

Unsorted List N/A

RRN List

Cache N/A

Buffer N/A

Temporary Index Creation

A temporary index is created to allow query processing that requires
access by key.

– Advantages:
▪ The optimizer can create the most optimal index possible
▪ The temporary index may be smaller because local selection can be
applied before it is created (sparse index)

– Potential disadvantages:
▪ It must be populated in its entirety before it can be consumed
▪ The index is maintained during inserts, updates and deletes
▪ Temporary indexes have limited life spans

– When to expect them:
▪ The optimizer chooses a plan that requires an index that does not exist
▪ Optimization Goal is First I/O

Temporary Index Example

SELECT DeptName
FROM Department D
WHERE DeptNbr = 'F01'

Scan the Department table and select the key column to build the
temporary index.

Department Table
DeptNbr  DeptName  ...
B01      BBBB
G01      GGGG
A01      AAAA
E01      EEEE
C01      CCCC
H01      HHHH
F01      FFFF
...      ...

Temp Index
DeptNbr
A01
B01
C01
D01
E01
F01
G01
H01
...

Temporary Index Example - Graph

SELECT DeptName
FROM Department D
WHERE DeptNbr = 'F01'

The temporary index can be created as the result of a
table scan, index scan or index probe

Temporary indexes are probed and scanned exactly like permanent
indexes.

Hash Table Creation

Use a hashing algorithm to create a data structure which
organizes/collates/relates data elements by a common (hash) value

– Advantages:
▪ The hash table may be smaller because local selection can be
applied before it is created

– Potential disadvantages:
▪ It must be populated in its entirety before it can be consumed

– When to expect them:
▪ Optimizer chooses a plan that requires collated data and an
appropriate index does not exist or is too expensive to create
▪ Optimization Goal is All I/O

Hashing Algorithm
• A well-defined method to map a larger value to a smaller
value, usually an integer which can be used as an index
into an array
– If more than one input value maps to the same output value, this
collision is detected and the values are still managed separately.
– An ideal hash algorithm minimizes the number of entries in the list of
output values and minimizes the number of collisions.
[Figure: a hashing algorithm maps the input values Australia, China, Austria, Greece and Denmark to hash values 1, 2, 3, 4, 5, 6, 7, …]
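A minimal sketch of a hash table that keeps colliding values separate is shown below. The byte-sum hash function and the `ChainedHashTable` class are illustrative assumptions, chosen only because they are deterministic and easy to follow; they say nothing about the hashing algorithm Db2 for i actually uses.

```python
class ChainedHashTable:
    """Toy hash table: each slot holds a list so colliding values stay separate."""

    def __init__(self, n_buckets=7):
        self.buckets = [[] for _ in range(n_buckets)]

    def _slot(self, key):
        # Deliberately simple, deterministic hash: sum of the UTF-8 bytes.
        return sum(key.encode("utf-8")) % len(self.buckets)

    def add(self, key):
        bucket = self.buckets[self._slot(key)]
        if key not in bucket:  # a collision does not overwrite another value
            bucket.append(key)

    def contains(self, key):
        return key in self.buckets[self._slot(key)]


COUNTRIES = ["Australia", "China", "Austria", "Greece", "Denmark"]
table = ChainedHashTable()
for country in COUNTRIES:
    table.add(country)
print(table.contains("Austria"), table.contains("Norway"))
```

Shrinking `n_buckets` forces more collisions, but every value can still be found because each bucket keeps its entries individually.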

Temporary Hash Table

Orders Table (“On Disk”)
CUSTKEY  ORDERKEY  AMOUNT  …
000056   110010    500     …
000056   110050    1000    …
000056   110240    650     …
000101   110005    1400    …
000101   110036    750     …
…        …         …       …

Populate

Temporary Hash Table (“In Memory”)
Key      Orderkey  Amount
000056   110010    500
         110050    1000
         110240    650
000101   110005    1400
         110036    750
…        …         …

Temporary Hash Table

The Customers table is JOINed to the temporary hash table, which holds
the necessary Orders columns…

Customers Table (“On Disk”)
CUSTKEY  CUSTOMER             …
000056   Cain and Co.         …
000057   Parkinson’s Mfg      …
000058   Windy City Brewing   …
000059   Zumbro Distribution  …
000060   Zyonic Logistics     …
…        …                    …

Temporary Hash Table (“In Memory”)
Key      Orderkey  Amount
000056   110010    500
         110050    1000
         110240    650
000101   110005    1400
         110036    750
…        …         …
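The populate-then-probe pattern of a hash join can be sketched with the figure's data. Python's built-in dictionary stands in for the temporary hash table, and `build_hash_table` and `hash_join` are hypothetical helpers; only the customer rows visible in the figure are included.

```python
from collections import defaultdict

# Hypothetical copies of the rows visible in the figure.
ORDERS = [
    ("000056", 110010, 500), ("000056", 110050, 1000), ("000056", 110240, 650),
    ("000101", 110005, 1400), ("000101", 110036, 750),
]
CUSTOMERS = [
    ("000056", "Cain and Co."), ("000057", "Parkinson's Mfg"),
    ("000058", "Windy City Brewing"),
]


def build_hash_table(orders):
    """Populate step: collect only the needed Orders columns under each CUSTKEY."""
    table = defaultdict(list)
    for custkey, orderkey, amount in orders:
        table[custkey].append((orderkey, amount))
    return table


def hash_join(customers, hash_table):
    """Consume step: probe the hash table once per customer row."""
    for custkey, name in customers:
        for orderkey, amount in hash_table.get(custkey, []):
            yield (custkey, name, orderkey, amount)


hash_table = build_hash_table(ORDERS)
joined = list(hash_join(CUSTOMERS, hash_table))
print(joined)
```

The hash table must be fully populated before the first customer row is probed, which matches the startup cost described for temporary data structures.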

Temporary Hash Table (Hash Index)

Orders Table (“On Disk”)
CUSTKEY  ORDERKEY  AMOUNT  …
000056   110010    500     …
000056   110050    1000    …
000056   110240    650     …
000101   110005    1400    …
000101   110036    750     …
…        …         …       …

Populate

Temporary Hash Table (“In Memory”)
Key      RRN
000056   1
         2
         3
000101   4
         5
…        …

Temporary Distinct Hash Table

Aggregated data…

Orders Table (“On Disk”)
CUSTKEY  ORDERKEY  AMOUNT  …
000056   110010    500     …
000056   110050    1000    …
000056   110240    650     …
000101   110005    1400    …
000101   110036    750     …
…        …         …       …

Temporary Distinct Hash Table (“In Memory”)
Key      Aggregate
000056   2150
000057   3010
000101   2150
…        …
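A distinct hash table keeps one entry per key and folds each incoming row into that entry. The sketch below reproduces the aggregates for the two customers whose order rows are visible in the figure; `aggregate_by_key` is a hypothetical helper and a Python dictionary stands in for the hash table.

```python
# Hypothetical copy of the Orders rows visible in the figure.
ORDERS = [
    ("000056", 110010, 500), ("000056", 110050, 1000), ("000056", 110240, 650),
    ("000101", 110005, 1400), ("000101", 110036, 750),
]


def aggregate_by_key(orders):
    """Keep a single running total per CUSTKEY instead of every order row."""
    totals = {}
    for custkey, _orderkey, amount in orders:
        totals[custkey] = totals.get(custkey, 0) + amount
    return totals


totals = aggregate_by_key(ORDERS)
print(totals)
```

The structure holds one entry per distinct key no matter how many order rows feed it, which keeps its memory footprint small for grouping queries.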

Hash Table Creation

• Input: Join or Group By value
• Output: Hash value
• The same input value will always result in the same hash value
• Only relevant information is stored and the data is in no particular order
• Collect all common input values into a common bucket (by hash value)
• Selection can be performed using any method available, including
parallel enabled methods

Input rows
PART    COLOR
Part M  Blue
Part O  Green
Part A  Purple
Part Z  Orange
Part G  Blue
Part A  Purple
Part B  Red
Part B  Red
Part I  Green
Part X  Blue
Part Z  Orange
Part Q  Yellow
Part A  Purple
Part C  Yellow
Part A  Purple
Part B  Red
Part I  Green

Hash table buckets produced by the hashing algorithm:
Oranges, Purples, Reds, Greens, Blues, Yellows

Distinct Hash Table Example

SELECT FirstName, SUM(Days)
FROM EMPLOYEE
WHERE WORKDEPT BETWEEN 'A01' AND 'E01'
GROUP BY FirstName

The EMPLOYEE table is scanned and the selected rows are passed through
the hashing algorithm into the hash table data structure.

EMPLOYEE Table, WORKDEPT (in scan order):
B01, B01, G01, A01, G01, E01, A01, C01, D01, A01, C01, F01, B01, C01,
D01, F01, A01

Hash Table Data Structure, FIRSTNAME (in no particular order):
Kent, Mike, Mark, Amy, Bob, Tom, Jane, Michael, Janice, Sally, ...

Hash Table Example - Graph

SELECT FirstName, SUM(Days)
FROM EMPLOYEE
WHERE WORKDEPT BETWEEN 'A01' AND 'E01'
GROUP BY FirstName

The hash table can be created as the result of a
table scan, index scan or index probe, or any
combination of methods

Hash Table Probe - Example

Read a row, and produce a "key" value to be used for probing the hash
table...

• The key value 'Blue' is passed through the hashing algorithm
• The hash table is probed, returning all data associated with 'Blue'
• Selection can be performed using any method available, including
parallel enabled methods

[Figure: hash table with the buckets Oranges, Purples, Reds, Greens, Blues, Yellows; the probe lands on Blues]

Hash Table Probe - Graph

SELECT E.LastName, E.WorkDept, D.DeptName
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr

Any method(s) can be used to access rows on the left
side of the join.
Any method(s) can be used to populate the hash table.

Hash Table Scan - Example

After the hash table is populated using any method(s),
unload the information in the hash table.
• The data is not returned in any particular order.

[Figure: a scan of the hash table returns all data associated with each bucket in turn: Oranges, Purples, Reds, Greens, Blues, Yellows]

Hash Table Scan - Graph

SELECT WorkDept, SUM(Days)
FROM Employee
WHERE WorkDept IN ('A01', 'C01', 'E01')
GROUP BY WorkDept

Any method(s) can be used to select the rows and
populate the hash table including parallel methods.

“Skinny” Hash Table

Used when fair share is too small to contain entire hash table
and there is no permanent index available for access

• Hash table contains minimum number of columns
• Must read the table to obtain more data
• Index used to minimize memory footprint during I/O

[Figure: hash table implementation with adequate fair share versus inadequate fair share]

Sorted List Creation

A sort routine is used to create a sorted list, which orders or sequences
the data so that rows with a common value are kept together.

– Advantages:
▪ The sorted list can be populated with just the subset of rows required
to satisfy the query

– Potential disadvantages:
▪ The data is processed twice, first to populate the list and then to sort
the values
▪ It has to be populated in its entirety before it can be consumed

– When to expect them:
▪ The data has to be ordered to support ordering or distinct processing
▪ An appropriate index is not available
▪ Optimization Goal is All I/O

Sorted List Example

SELECT *
FROM EMPLOYEE
WHERE WORKDEPT BETWEEN 'A01' AND 'E01'
ORDER BY FirstName

Index X1 is scanned and the selected rows are passed to the sort routine,
which builds the sorted list data structure.

Index X1 (scanned)
LASTNAME  WORKDEPT
Adamson   B01
Anderson  B01
Anderson  G01
Cain      A01
Caine     G01
Doe       E01
Jones     A01
Jones     C01
Jones     D01
Milligan  A01
Peterson  C01
Peterson  F01
Smith     B01
Smith     C01
Smith     D01
Smith     F01
Wulf      A01

Sorted List Data Structure, FIRSTNAME (in sorted order):
Amy, Bob, Jane, Janice, Kent, Mark, Michael, Mike, Sally, Tom, ...

Sorted List Example - Graph

SELECT *
FROM EMPLOYEE
WHERE WORKDEPT BETWEEN 'A01' AND 'E01'
ORDER BY FirstName

The sorted list can be created as the result of a
table scan, index scan or index probe, or any
combination of methods

Sorted List Scan Example

SELECT *
FROM EMPLOYEE
WHERE WorkDept IN ('A01', 'C01', 'E01')
ORDER BY FirstName

The sort routine builds the sorted list data structure, which is then
scanned.

Sorted List Data Structure, FIRSTNAME (in sorted order):
Amy, Bob, Jane, Janice, Kent, Mark, Michael, Mike, Sally, Tom, ...

Sorted List Scan Example - Graph

SELECT *
FROM EMPLOYEE
WHERE WorkDept IN ('A01', 'C01', 'E01')
ORDER BY FirstName

The sorted list is scanned and the data
is returned in order

Sorted List Probe Example

SELECT * FROM
EMPLOYEE E INNER JOIN DATE_MASTER D
ON E.HIRE_DATE >= D.DATEKEY

The sort routine builds the sorted list data structure, which is then
probed.

Sorted List Data Structure, HIRE_DATE (in sorted order):
2003/12/30
2003/12/31
2004/01/05
2004/01/05
2004/01/08
2004/01/10
2004/01/12
2004/01/12
2004/01/14
2004/01/15
...
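Probing a sorted list for a non-equal join condition such as E.HIRE_DATE >= D.DATEKEY can be sketched with `bisect`: one positioning step finds the first qualifying entry, and everything after it also qualifies. The `HIRE_DATES` list and `probe_sorted_list` function are hypothetical, and the dates are kept as yyyy/mm/dd strings because that format sorts correctly as text.

```python
import bisect

# Hypothetical copy of the sorted list's HIRE_DATE values from the figure.
HIRE_DATES = [
    "2003/12/30", "2003/12/31", "2004/01/05", "2004/01/05", "2004/01/08",
    "2004/01/10", "2004/01/12", "2004/01/12", "2004/01/14", "2004/01/15",
]


def probe_sorted_list(datekey):
    """Return every hire date that satisfies HIRE_DATE >= datekey."""
    start = bisect.bisect_left(HIRE_DATES, datekey)  # first entry >= datekey
    return HIRE_DATES[start:]


matches = probe_sorted_list("2004/01/08")
print(len(matches), matches[0])
```

Because the list is ordered, the probe does not need to test the entries before the starting position at all.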

Sorted List Example - Graph

SELECT * FROM
EMPLOYEE E INNER JOIN DATE_MASTER D
ON E.HIRE_DATE >= D.DATEKEY

The sorted list is probed to find a
matching value

Unsorted List Creation

• An unsorted list is created to store data for an intermediate
operation.
– It is the SQE equivalent to using a temporary table but much faster

– Advantages:
▪ The most efficient operation(s) can be used for selecting the data
▪ The unsorted list contains a subset of the data

– Potential disadvantages:
▪ No order or key associated with the data
▪ Must be populated in its entirety before it can be consumed.

– When to expect them:
▪ The query contains complex operations that require multiple steps to
complete the processing
▪ Optimization Goal is All I/O

Unsorted List Example

SELECT D.DeptName, E.WorkDept, SUM(E.Days)
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
GROUP BY D.DeptName, E.WorkDept

Build an unsorted list before joining

Figure: "Select required rows" populates the unsorted list.

Unsorted List
DeptName  WorkDept  Days
BBBB      B01       5
BBBB      B01       22
GGGG      G01       7
AAAA      A01       0
GGGG      G01       4
EEEE      E01       7
AAAA      A01       14
CCCC      C01       19

Unsorted List Example - Graph

SELECT D.DeptName, E.WorkDept, SUM(E.Days)
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
GROUP BY D.DeptName, E.WorkDept

Any technique can be used to populate the unsorted list

Unsorted List Scan Example

SELECT D.DeptName, E.WorkDept, SUM(E.Days)
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
GROUP BY D.DeptName, E.WorkDept

Build an unsorted list before grouping

Figure: a Scan reads the unsorted list.

Unsorted List
DeptName  WorkDept  Days
BBBB      B01       5
BBBB      B01       22
GGGG      G01       7
AAAA      A01       0
GGGG      G01       4
EEEE      E01       7
AAAA      A01       14
CCCC      C01       19

Unsorted List Scan Example - Graph

SELECT D.DeptName, E.WorkDept, SUM(E.Days)
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
GROUP BY D.DeptName, E.WorkDept

Read through the unsorted list and perform follow-on processing
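The populate-then-scan pattern can be sketched as follows. The rows are the ones shown in the Unsorted List Scan Example figure; the function name `scan_and_group` is hypothetical, and the grouping step is only one possible kind of follow-on processing.

```python
from collections import defaultdict

# Unsorted list contents as shown in the example figure:
# (DeptName, WorkDept, Days). No order or key is associated with it.
unsorted_list = [
    ("BBBB", "B01", 5),
    ("BBBB", "B01", 22),
    ("GGGG", "G01", 7),
    ("AAAA", "A01", 0),
    ("GGGG", "G01", 4),
    ("EEEE", "E01", 7),
    ("AAAA", "A01", 14),
    ("CCCC", "C01", 19),
]

def scan_and_group(rows):
    """Read every entry once and accumulate SUM(Days) per group."""
    totals = defaultdict(int)
    for dept_name, work_dept, days in rows:
        totals[(dept_name, work_dept)] += days
    return dict(totals)

totals = scan_and_group(unsorted_list)
print(totals)
```

Since the list carries no key, the only way to consume it is a full scan, which is why the follow-on step does its own grouping.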

RRN List Creation

• Generate a list of relative row numbers (RRNs) that represent the rows matching the local selection.

– Advantages:
▪ The most efficient index(es) can be used for selecting the rows
▪ Multiple RRN lists can be ANDed / ORed together, further reducing the need to probe the table
▪ No I/Os to the table during generation of the RRN list
▪ The order of the RRNs is based on the physical order of the rows in the table

– Potential disadvantages:
▪ It must be populated in its entirety before it can be consumed

– When to expect them:
▪ One or more indexes match the local selection
▪ Optimization Goal is All I/O

RRN List Example

Given an index on table EMPLOYEE keyed on STATE...

...WHERE STATE = 'Iowa'...

Figure: radix index on STATE. The index is binary searched for the key(s) and row number(s), and the matching entries are returned as an RRN list.

ROOT
  Test Node
    MISS
      ISSIPPI  002
      OURI     003
    IOWA       004
    AR
      IZONA    005
      KANSAS   001

RRN List Example

Given an EVI on table EMPLOYEE keyed on WORKDEPT...

...WHERE WORKDEPT IN ('B01', 'C01', 'E01')

Figure: the symbol table is binary searched for the key(s) and code(s), the vector is scanned for the code(s), and the matching positions are returned as an RRN list.

Symbol Table
Key   Code
A00   1
A01   2
A04   3
B00   4
B01   5
B02   6
B05   7
C00   8
...   ...

Vector
1
17
5
9
2
7
49
49
5
...
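The two steps in the EVI figure (search the symbol table, then scan the vector) can be sketched like this. The symbol table and vector hold only the entries visible in the figure, so keys that fall in the elided part of the table ('C01' and 'E01') simply find nothing here. The function names and the 1-based RRN numbering are assumptions of the sketch.

```python
from bisect import bisect_left

# Only the entries visible in the figure; the rest of the table is elided.
SYMBOL_KEYS = ["A00", "A01", "A04", "B00", "B01", "B02", "B05", "C00"]
SYMBOL_CODES = [1, 2, 3, 4, 5, 6, 7, 8]
VECTOR = [1, 17, 5, 9, 2, 7, 49, 49, 5]  # one code per row, in row order

def lookup_code(key):
    """Binary search the symbol table for a key; return its code or None."""
    i = bisect_left(SYMBOL_KEYS, key)
    if i < len(SYMBOL_KEYS) and SYMBOL_KEYS[i] == key:
        return SYMBOL_CODES[i]
    return None

def rrn_list(keys):
    """Scan the vector for the wanted codes; positions become RRNs."""
    codes = {c for c in map(lookup_code, keys) if c is not None}
    # Assumption: RRNs are numbered from 1.
    return [rrn for rrn, code in enumerate(VECTOR, start=1) if code in codes]

rrns = rrn_list(["B01", "C01", "E01"])
print(rrns)
```

The vector is read in row order, so the resulting RRN list comes out in the physical order of the table without any I/O against the table itself.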

RRN List Example - Graph

SELECT *
FROM EMPLOYEE
WHERE WORKDEPT IN ('B01', 'C01', 'E01')

SQE uses a variation on table probe called clustered I/O

Index ANDing / ORing Example 1
SELECT *
FROM EMPLOYEE
WHERE STATE = 'IOWA'
OR WORKDEPT IN ('B01', 'C01', 'E01')

Indexes: State (Radix), Workdept (EVI)

Intermediate RRN list   Intermediate RRN list   Final RRN list
(State)                 (Workdept)              (OR / Merge)
4                       2                       2
1001                    8                       4
1005                    1004                    8
3051                    4105                    1001
                                                1004
                                                1005
                                                3051
                                                4105

The final RRN list represents all the local selection.
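ORing two RRN lists amounts to a merge of two ordered lists with duplicates dropped. The sketch below uses the RRN values as read from the figure for Example 1; `or_merge` is a hypothetical name, and the real SQE mechanism may differ.

```python
import heapq

# Intermediate RRN lists as read from the Example 1 figure.
state_rrns = [4, 1001, 1005, 3051]      # STATE = 'IOWA' (radix index)
workdept_rrns = [2, 8, 1004, 4105]      # WORKDEPT IN (...) (EVI)

def or_merge(a, b):
    """Merge two ascending RRN lists, keeping each RRN once."""
    merged = []
    for rrn in heapq.merge(a, b):
        if not merged or merged[-1] != rrn:
            merged.append(rrn)
    return merged

final_rrns = or_merge(state_rrns, workdept_rrns)
print(final_rrns)
```

The output stays in ascending RRN order, which is what allows the table to be read in physical order afterwards.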

Index ANDing / ORing Example 2

SELECT *
FROM EMPLOYEE
WHERE COUNTRY = 'France'
AND WORKDEPT IN ('B01', 'C01', 'E01')

Indexes: Country (EVI), Workdept (EVI)

Intermediate RRN list   Intermediate RRN list   Final RRN list
(Country)               (Workdept)              (AND / Merge)
3                       3                       3
5                       7                       10
10                      10                      1000
15                      27                      3001
1000                    1000
1005                    1010
1007                    2035
3001                    3001
3050                    4100

The final RRN list represents all the local selection.
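ANDing keeps only the RRNs present in both lists. A two-pointer walk over the ordered lists is one simple way to picture it; the values below are the ones read from the Example 2 figure, and `and_merge` is a hypothetical helper rather than an SQE routine.

```python
# Intermediate RRN lists as read from the Example 2 figure.
country_rrns = [3, 5, 10, 15, 1000, 1005, 1007, 3001, 3050]
workdept_rrns = [3, 7, 10, 27, 1000, 1010, 2035, 3001, 4100]

def and_merge(a, b):
    """Walk two ascending RRN lists in step and keep the common RRNs."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common

final_rrns = and_merge(country_rrns, workdept_rrns)
print(final_rrns)
```

Only four of the nine candidate rows from each index survive, so far fewer table probes are needed than either index would require on its own.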

Index ANDing / ORing Example - Graph

SELECT *
FROM EMPLOYEE
WHERE STATE = 'Iowa'
OR WORKDEPT IN ( 'B01', 'C01', 'E01')

RRN List Scan Example

Given separate indexes on table EMPLOYEE keyed on Position and Salary

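SELECT * FROM EMPLOYEE
WHERE Position = 'Mgr'
OR Salary BETWEEN 50000 AND 100000

Figure: IX_Salary produces an RRN list for Salary BETWEEN 50000 AND 100000, and EVI_Position produces an RRN list for Position = 'Mgr'. The two lists are combined into the final RRN list, and an RRN List Scan uses it to read EMPLOYEE.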

RRN List Scan Example - Graph

SELECT * FROM EMPLOYEE
WHERE Position = 'Mgr'
OR Salary BETWEEN 50000 AND 100000

SQE uses a variation on table probe called clustered I/O
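One way to picture why an RRN list in physical order helps is to group neighbouring RRNs by the page they live on, so that each page is read once. This is purely an illustration: the `ROWS_PER_PAGE` value, the page arithmetic and the function name are assumptions, and the slides do not describe how clustered I/O is actually implemented.

```python
from itertools import groupby

ROWS_PER_PAGE = 1000  # hypothetical page capacity, for illustration only

def pages_to_read(rrns):
    """Group an ascending RRN list by (assumed) page number."""
    def page_of(rrn):
        return (rrn - 1) // ROWS_PER_PAGE
    return {page: list(group) for page, group in groupby(rrns, key=page_of)}

# Hypothetical final RRN list, already in physical row order.
final_rrns = [2, 4, 8, 1001, 1004, 1005, 3051, 4105]
plan = pages_to_read(final_rrns)
print(plan)
```

In this toy model eight rows are satisfied by four page reads, where probing the table row by row in index order could touch the same page repeatedly.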

Cache Creation

• A cache is created to store data for an intermediate operation.

– Advantages:
▪ The most efficient operation(s) can be used for selecting the data
▪ The cache contains a subset of the data
▪ Allows subsequent methods to access the cache instead of the database object

– Potential disadvantages:
▪ Only effective in caching repetitive values that don't change often

– When to expect them:
▪ The query is accessing the same set of values over and over
▪ Works well when there is a relatively small set of values (low cardinality)

Cache Example

SELECT D.DeptName, E.WorkDept
FROM Employee E LEFT OUTER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE WorkDept BETWEEN 'A01' AND 'E01'

Populate cache as rows are selected

Figure: "Select required rows" feeds the Cache.

Cache
WorkDept
B01
A01
D01
E01
C01

Cache Example - Graph

SELECT D.DeptName, E.WorkDept
FROM Employee E LEFT OUTER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE WorkDept BETWEEN 'A01' AND 'E01'

Any technique can be used to populate the Cache

Cache Probe Example

SELECT D.DeptName, E.WorkDept
FROM Employee E LEFT OUTER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE WorkDept BETWEEN 'A01' AND 'E01'

Probe cache as rows are selected

Figure: a Probe looks up a WorkDept value in the Cache.

Cache
WorkDept
A01
B01
C01
D01
E01

Cache Example - Graph

SELECT D.DeptName, E.WorkDept
FROM Employee E LEFT OUTER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE WorkDept BETWEEN 'A01' AND 'E01'

Any technique can be used to populate the Cache
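The populate-and-probe behaviour of a cache can be sketched with a dictionary that is filled on first use. The DEPARTMENT contents, the function names and the lookup counter are hypothetical, added only to show how repeated values avoid going back to the database object.

```python
# Hypothetical DEPARTMENT lookup: DeptNbr -> DeptName.
DEPARTMENT = {"A01": "AAAA", "B01": "BBBB", "C01": "CCCC"}
lookups = 0  # counts how often the "database object" is touched

def fetch_dept_name(dept_nbr):
    """Stand-in for accessing the Department table or its index."""
    global lookups
    lookups += 1
    return DEPARTMENT.get(dept_nbr)  # None models the outer-join NULL

cache = {}

def probe_cache(dept_nbr):
    """Probe the cache first; populate it on a miss."""
    if dept_nbr not in cache:
        cache[dept_nbr] = fetch_dept_name(dept_nbr)
    return cache[dept_nbr]

# Hypothetical WorkDept values from the selected Employee rows.
work_depts = ["B01", "A01", "B01", "D01", "A01", "B01"]
result = [(probe_cache(d), d) for d in work_depts]
print(result, lookups)
```

Six rows are joined with only three accesses to the underlying object, which is why a cache pays off when the same few values repeat.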

Buffer Creation

• A buffer is created to store data for an intermediate operation.

– Advantages:
▪ The most efficient operation(s) can be used for selecting the data
▪ The buffer contains a subset of the data
▪ Allows subsequent methods to run in parallel

– Potential disadvantages:
▪ No order or key associated with the data

– When to expect them:
▪ The query is running with parallel degree > 1 and the buffer is used to feed the parallel processes
▪ Optimization Goal is All I/O

Buffer Example

SELECT D.DeptName, E.WorkDept
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE D.DeptNbr BETWEEN 'A01' AND 'E01'

Build buffer before parallel join

Figure: "Select required rows" populates the Buffer.

Buffer
WorkDept
A01
A01
A01
B01
B01
C01
C01
C01

Buffer Example - Graph

SELECT D.DeptName, E.WorkDept
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE D.DeptNbr BETWEEN 'A01' AND 'E01'

Any technique can be used to populate the buffer

Buffer Scan Example

SELECT D.DeptName, E.WorkDept
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE D.DeptNbr BETWEEN 'A01' AND 'E01'

Build buffer before parallel join

Figure: "Select required rows" populates the Buffer, and a Scan reads it.

Buffer
WorkDept
B01
B01
A01
A01
B01
E01
A01
C01

Buffer Scan Example - Graph

SELECT D.DeptName, E.WorkDept
FROM Employee E INNER JOIN Department D
ON E.WorkDept = D.DeptNbr
WHERE D.DeptNbr BETWEEN 'A01' AND 'E01'

Read through the buffer and feed a parallel process
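A rough picture of a buffer feeding parallel work: the buffer is split into chunks and each chunk is joined by a separate worker. Python threads stand in for the parallel processes here, and the DEPARTMENT contents, the chunking scheme and the function names are all assumptions of the sketch rather than a description of Db2 for i.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical DEPARTMENT lookup: DeptNbr -> DeptName.
DEPARTMENT = {"A01": "AAAA", "B01": "BBBB", "C01": "CCCC", "E01": "EEEE"}

# Buffer contents (WorkDept) as shown in the Buffer Scan Example figure.
buffer = ["B01", "B01", "A01", "A01", "B01", "E01", "A01", "C01"]

def join_chunk(chunk):
    """Inner join one chunk of buffered rows to DEPARTMENT."""
    return [(DEPARTMENT[d], d) for d in chunk if d in DEPARTMENT]

def parallel_join(buf, degree=2):
    """Scan the buffer in chunks and hand each chunk to a worker."""
    size = -(-len(buf) // degree)  # ceiling division
    chunks = [buf[i:i + size] for i in range(0, len(buf), size)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        parts = list(pool.map(join_chunk, chunks))
    return [row for part in parts for row in part]

joined = parallel_join(buffer, degree=2)
print(joined)
```

Because the buffer has no order or key, any chunk can go to any worker, which is what makes it a convenient hand-off point for parallel steps.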

Temporary Data Structure - Delays

The clock in the icon indicates a potential delay in execution


Other Interesting Nodes

Nested Loop Join - Graph

Aggregation - Graph

Union and Union All - Graph

Fetch N Rows, Lock Row - Graph

Logic – Early Exit

• A Logic node indicates that the optimizer is attempting to avoid all or part of the query execution based on values in the statement
• Early exits are employed for:
– Nonsense queries (WHERE 1=0)
– Constraint awareness
– Host variable comparisons (col1 BETWEEN ? AND ?)
• The Logic node is processed before:
– Any tables or indexes are paged into memory
– Any temporary data structures are populated
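The early-exit idea for a host variable comparison can be sketched as a check that runs before any data is touched. This is an illustration only; `run_query` and the `touched` bookkeeping list are hypothetical names invented for the example.

```python
def run_query(rows, low, high, touched):
    """Model of col1 BETWEEN ? AND ? with a logic check up front."""
    # Logic check: BETWEEN low AND high can never be true when low > high,
    # so the result is known to be empty without reading anything.
    if low > high:
        return []
    touched.append("table")  # only now is the table accessed
    return [r for r in rows if low <= r <= high]

ROWS = [1, 5, 9]  # hypothetical col1 values

touched_a = []
empty = run_query(ROWS, 10, 2, touched_a)   # early exit

touched_b = []
hits = run_query(ROWS, 2, 9, touched_b)     # normal execution
print(empty, touched_a, hits, touched_b)
```

The first call returns without recording any table access, mirroring the point that the Logic node runs before tables, indexes or temporary data structures are involved.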

Logic – Before and After i 7.2
• Starting with IBM i 7.2, logic nodes are shown in line with the corresponding
part of the Visual Explain. They have initial processing which is implemented
from the top down and which still avoids unnecessary processing.

User Defined Table Functions

Complicated

• Complicated icons indicate functionality that is not directly implemented in SQL
• This example is from a user defined function call:

SELECT custname, discount(totsales)
FROM ordersum ;

Values List

• The Values List appears when the VALUES clause is used in a SELECT or a VIEW:

SELECT loc_id, loc_name FROM TABLE
(VALUES(1,'Fargo'),(2,'Bismarck'))
AS locations(loc_id, loc_name) ;

Rank

• The Rank icon appears when the RANK or DENSE_RANK function is used

SELECT sales_person, sales_total,
RANK() OVER (ORDER BY sales_total DESC)
as sales_rank
FROM sales ;
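The difference between RANK and DENSE_RANK over a descending order can be worked through in a few lines. The sales totals below are made-up values and `rank_rows` is a hypothetical helper; the sketch shows the SQL semantics, not how the Rank node is implemented.

```python
def rank_rows(totals):
    """Return (value, rank, dense_rank) for ORDER BY value DESC."""
    ordered = sorted(totals, reverse=True)
    out = []
    rank = dense = 0
    prev = None
    for position, value in enumerate(ordered, start=1):
        if value != prev:
            rank = position   # RANK jumps to the row position after ties
            dense += 1        # DENSE_RANK advances by one per distinct value
            prev = value
        out.append((value, rank, dense))
    return out

ranked = rank_rows([300, 500, 300, 100])  # hypothetical sales_total values
print(ranked)
```

Tied values share a rank; RANK then leaves a gap (1, 2, 2, 4) while DENSE_RANK does not (1, 2, 2, 3).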

Temporary Distinct Sorted List

• The Temporary Distinct Sorted List is a temporary distinct hash table with an index so it can be accessed in sequence.
• It is used to support ROLLUP:

SELECT sales_person, SUM(sales_total) FROM sales
GROUP BY ROLLUP(sales_person) ;
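A loose analogy for "a distinct hash table that can be read in sequence" is a dictionary of per-key totals that is then visited in key order to emit the ROLLUP rows. The sales rows and the `rollup` function are hypothetical, and the sketch models the query result rather than the internal structure.

```python
def rollup(rows):
    """Return per-person totals in key order, then the grand total row."""
    totals = {}  # hash table holding one entry per distinct sales_person
    for person, amount in rows:
        totals[person] = totals.get(person, 0) + amount
    # Reading the distinct keys in sequence gives the ordered group rows.
    result = [(person, totals[person]) for person in sorted(totals)]
    # ROLLUP adds a super-aggregate row; None models the SQL NULL key.
    result.append((None, sum(totals.values())))
    return result

sales = [("Lee", 10), ("Ann", 5), ("Lee", 7)]  # hypothetical rows
rolled = rollup(sales)
print(rolled)
```

The last row carries the grand total, which is the extra level that ROLLUP(sales_person) adds on top of a plain GROUP BY.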

Enqueue / Dequeue

• Enqueue and dequeue are used for recursive query implementations
– Recursive Common Table Expressions
– Hierarchical CONNECT BY
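A recursive query can be pictured as a queue: rows found at one level are enqueued, then dequeued to look for their children. The sketch below walks a made-up reporting hierarchy; `REPORTS_TO` and `subordinates` are hypothetical and only illustrate the enqueue / dequeue pattern.

```python
from collections import deque

# Hypothetical employee -> manager relationships.
REPORTS_TO = {"Ann": None, "Bob": "Ann", "Cy": "Ann", "Di": "Bob"}

def subordinates(root):
    """Return (name, level) for root and everyone below, level by level."""
    queue = deque([(root, 0)])          # seed row, like the CTE anchor
    out = []
    while queue:
        name, level = queue.popleft()   # dequeue the next row to expand
        out.append((name, level))
        for emp, mgr in REPORTS_TO.items():
            if mgr == name:
                queue.append((emp, level + 1))  # enqueue rows found
    return out

tree = subordinates("Ann")
print(tree)
```

Processing stops when the queue is empty, which corresponds to the recursive step returning no new rows.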


LAB
Accessing Temporary Data Structures
