Hash Joins - Implementation and Tuning
Technical Report
Release 7.3
TABLE OF CONTENTS
INTRODUCTION
  Audience
  Document Overview
  Concepts
  Conventions Used in this Paper
NESTED LOOPS JOIN (NLJ)
SORT MERGE JOIN (SMJ)
PARALLELISM IN JOIN EXECUTION
HASH JOINS OVERVIEW
  Parallel Hash Joins
  Anti-Joins and Outer Joins
HASH JOIN PRINCIPLES
HASH JOINS ALGORITHM
COSTING OF HASH JOINS
  In-Memory Hash Join
  On-Disk Hash Join
HASH JOINS PERFORMANCE TUNING
  Minimum Setting for HASH_AREA_SIZE
  HASH_AREA_SIZE and Tablespace I/O
  HASH_MULTIBLOCK_IO_COUNT
  Size of Join Inputs
  Reliability of Predicting Relative Sizes of Two Join Inputs
  Exploiting Interesting Orderings
  Number of Distinct Values
  Skew in Build and Probe Inputs
  Other Performance Considerations
TROUBLESHOOTING
APPENDICES
  Appendix A: Citations
  Appendix B: Enabling Hash Joins
Copyright Oracle Corporation 1997. All rights reserved. Printed in the U.S.A.
ATTENTION READERS INCLUDING ORACLE EMPLOYEES: NO PART OF THIS DOCUMENT MAY BE REPRODUCED, IN ANY FORM, WITHOUT THE PERMISSION OF THE AUTHORS.
Authors: Prabhaker GP Gongloor, Sameer Patkar
Oracle is a registered trademark of Oracle Corporation. Oracle7 is a trademark of Oracle Corporation. All other products or company names are used for identification purposes only, and may be trademarks of their respective owners.
INTRODUCTION
Audience
You are an Oracle professional interested in the underlying internal mechanisms used in sorts and high-performance query processing. This report is written with the assumption that you have a solid understanding of Release 7.1 concepts. The following reports are recommended preparatory reading: Tuning Large Sorts and Asynchronous Reads, in the Technical Reports Compendium, Volume I, 1996, published by the Center of Expertise.
Document Overview
Extensive academic research indicates that a hash join algorithm is often the best choice for many join queries. In our preliminary research, we observed hash joins performing two to three times faster than sort merge joins in an idealized test environment. The preliminary tests used a small 10 MB table and a large table that was at least ten times the size of the small table. Our research is ongoing and may cover the following: table size thresholds at which hash joins perform better than sort merge joins, scalability of hash joins compared to sort merge joins or indexed sort merge joins, anti-joins, and so on. Research findings will be published at a later date.

We caution readers about applying results observed in a test environment to their own environment. Different configurations with varying workloads will have different behaviors and performance profiles. Nonetheless, we are confident that appropriate use of the information in this paper will help users significantly improve join query runtimes. Furthermore, we encourage readers to use their own development environments to test hash joins.

This document is organized to facilitate understanding hash joins and is meant to be read from start to finish. After this introductory section, we briefly describe the existing join methods as preparation for understanding the newer hash join method. The hash join part of the report starts with an overview, followed by several more detailed descriptive sections. To close out the report, we provide tips on performance tuning and troubleshooting.
Concepts
A join method defines the algorithm used to perform a join. For Release 7.3 and earlier, the nested loops join and sort merge join methods are available. These join methods are described in this paper as preparation for understanding the hash join method introduced in Release 7.3, which is a significant enhancement in query processing capabilities. We also describe implementation, usage best practices, performance characteristics, tuning, and guidelines for problem diagnosis.

A join returns rows created using columns from more than one table. A query with multiple tables in the FROM clause will perform one or more join operations. The Oracle server performs only one join operation at a time. A join operation reads a left and a right row source, known as a pair, and combines row pairs based on a join condition. If more than two tables must be joined, then pairing occurs successively until all tables are joined. Normally, a join condition is specified in the WHERE clause. (Absence of a WHERE clause will produce a Cartesian product of all rows from all tables joined.) Joins are classified into different topologies depending on the join predicates in the query; for example, star joins, chains, cycles, Cartesian products, and so on.

Note that the cost based optimizer must be enabled to use the hash join method. The Cost Based Optimizer (CBO) chooses a query execution plan by determining which plan is most efficient. Costs are determined based on an estimate of the resources consumed in terms of disk I/O, CPU, and network I/O. Each object accessed in the plan is costed using a model that is based on access methods, statistics, and initialization parameters. Object statistics include object size, cardinality, clustering factor of indexes, and so on. Initialization parameters include sort area size, hash area size, multi-block read factor, and so on. The optimizer also factors in any hints supplied by the user. When costing joins, the join cardinality and the join method costs are used for each join permutation considered.
Conventions Used in this Paper

Note the use of the letter a to denote the join column, and the letters S and B to denote the join tables. Let the sizes of the tables be rS = ROWS(S) and rB = ROWS(B), such that rS <= rB. If valid statistics were collected for the join tables using ANALYZE commands, then the cost based optimizer will correctly choose the join order to be (S,B), and not (B,S). The bigger table, B, is known as the inner table. The smaller table, S, is known as the outer table. S is also known as the driving table, for reasons that will become clear later in the paper. Initialization parameters are denoted in upper case; for example, DB_BLOCK_SIZE.
NESTED LOOPS JOIN (NLJ)

A nested loops join reads each row of the driving table S and, for each such row, scans the inner table B for matching rows. Ignoring CPU costs, the dominant cost is approximately:

    Cost(NLJ) = Read(S) + rS * [ Read(B) ]
The cost of a nested loops join can be prohibitive for large tables. The number of comparisons can be significantly reduced when a B*tree index exists on the bigger inner table. This variation is known as an indexed nested loops join. A nested loops join can be used for any join condition (equality, less than, greater than, and so on), including no join condition (resulting in a Cartesian product). The performance of an indexed nested loops join is reasonably good for result sets of low cardinality. An indexed NLJ performs especially well when the requested columns can be retrieved by an index look-up without accessing the big table. However, if the B*tree index is several levels deep, or possibly fragmented, then the indexed NLJ is usually not efficient due to the need to access a large number of intermediate nodes in the B*tree. In Oracle, the cost of accessing intermediate nodes of the index may be somewhat alleviated by the buffer cache: if the buffer cache is holding recently used index blocks, the I/O on intermediate nodes is reduced. Another issue worth considering is the space required for the index. The space needed by an index may be very large, and is sometimes comparable to the size of the table. Finally, the indexed nested loops join method may not be possible because only the leading columns in an index can be referenced in a query.
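To make the comparison cost concrete, here is a minimal sketch of a plain nested loops join over in-memory rows. It is illustrative only (not Oracle's implementation), and the table contents are hypothetical.

# Minimal nested loops join sketch (illustrative only).
# Every row of the driving table S is compared against every row of the inner table B.
def nested_loops_join(outer_rows, inner_rows, key="a"):
    result = []
    for s_row in outer_rows:                  # one pass over the outer (driving) table S
        for b_row in inner_rows:              # full scan of the inner table B per outer row
            if s_row[key] == b_row[key]:      # equality join condition on column "a"
                result.append((s_row, b_row))
    return result

S = [{"a": 1, "x": "s1"}, {"a": 3, "x": "s2"}]
B = [{"a": 1, "y": "b1"}, {"a": 2, "y": "b2"}, {"a": 3, "y": "b3"}]
print(nested_loops_join(S, B))                # rS * rB comparisons in total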
SORT MERGE JOIN (SMJ)

A sort merge join sorts both inputs on the join column and then merges the sorted row sources, Merge(S,B). The efficiency of Merge(S,B) depends on the memory allocated (SORT_AREA_RETAINED_SIZE in a non-MTS environment), the cost of reading and writing sort runs to and from temporary segments, and the cost of comparing the records. The sort merge join method does not require an index on the joined columns. Thus, the dominant costs in the sort merge join are sorting, reading the tables, and the I/O for reading and writing sort runs to and from the temporary tablespace. The sort merge join can be used for any join condition (equality, less than, greater than, and so on), including no join condition (resulting in a Cartesian product). The sort merge join performs reasonably well for large data sets, and its performance is largely governed by the size of the sort required in relation to the available memory, SORT_AREA_SIZE. Sort performance is non-linear with respect to sort set sizes [7].
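For contrast with the hash join described later, the following is a minimal sort merge join sketch (again illustrative only, not Oracle's implementation); both inputs are sorted on the join key and then merged, handling duplicate keys by re-scanning the matching run.

# Minimal sort merge join sketch on key "a" (illustrative only).
def sort_merge_join(s_rows, b_rows, key="a"):
    s_sorted = sorted(s_rows, key=lambda r: r[key])   # Sort(S)
    b_sorted = sorted(b_rows, key=lambda r: r[key])   # Sort(B)
    out, i, j = [], 0, 0
    while i < len(s_sorted) and j < len(b_sorted):
        sk, bk = s_sorted[i][key], b_sorted[j][key]
        if sk < bk:
            i += 1
        elif sk > bk:
            j += 1
        else:
            # Emit the cross product of the runs of equal keys on both sides.
            j_start = j
            while i < len(s_sorted) and s_sorted[i][key] == sk:
                j = j_start
                while j < len(b_sorted) and b_sorted[j][key] == sk:
                    out.append((s_sorted[i], b_sorted[j]))
                    j += 1
                i += 1
    return out

S = [{"a": 1}, {"a": 3}, {"a": 3}]
B = [{"a": 3}, {"a": 4}]
print(sort_merge_join(S, B))   # two matches for key 3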
Before Release 7.3, the above query would use a nested loops join. As discussed earlier in this paper, the performance of a nested loops join can be poor for large data sets when compared to other join algorithms. Oracle Release 7.3 supports hash joins and sort merge joins for anti-join queries. The relevant initialization parameter is ALWAYS_ANTI_JOIN, and the available hints are MERGE_AJ and HASH_AJ.
High fan-out can produce a large number of small partitions, resulting in inefficient I/O. At the other extreme, small fan-out can produce a small number of large partitions that will not fit in hash memory. Finding the optimum fan-out and partition size is the key to optimum performance. Writing partitions during the partitioning phase and reading them back during the join phase constitute a significant cost of the whole hash join operation. There is an optimal cluster size that balances efficient I/O against large fan-out within the constraints of available memory. A hash function on the join column separates the rows from tables S and B into disjoint buckets, or partitions. Oracle uses an internal hash function that tries to minimize data skew among the different buckets. For simplicity, this example uses modulo as the hash function:
MOD( join_column_value, 10)
Using the above hash function produces ten disjoint partitions for each of the two join tables, as shown in Figure 1. For the join, only keys that fall in partitions Si and Bj with i = j need to be compared.
PARTITION   VALUES
S0          0, 0, 10, 10
S1          1, 1, 1, 1, 11
S2          2, 2, 2, 2, 2, 2
S3          3
S4          NULL
S5          NULL
S6          NULL
S7          NULL
S8          8
S9          9, 9, 9
R0          10
R1          1, 1, 1
R2          NULL
R3          3, 3
R4          4, 4, 4, 4
R5
R6          NULL
R7          NULL
R8          8, 8, 8, 8
R9          NULL
Figure 1: Reduced Search Space through partitioning the build and probe inputs in Hash Joins
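The partitioning in Figure 1 can be sketched as follows. This is an illustration only: Oracle uses an internal hash function rather than MOD, and the key lists below are simply the example values shown in the figure.

# Partition build (S) and probe (B) join keys into 10 disjoint buckets using MOD,
# mirroring Figure 1 (illustrative sketch only).
def partition(keys, fanout=10):
    buckets = {i: [] for i in range(fanout)}
    for k in keys:
        buckets[k % fanout].append(k)
    return buckets

s_keys = [0, 0, 10, 10, 1, 1, 1, 1, 11, 2, 2, 2, 2, 2, 2, 3, 8, 9, 9, 9]
b_keys = [10, 1, 1, 1, 3, 3, 4, 4, 4, 4, 8, 8, 8, 8]

s_parts, b_parts = partition(s_keys), partition(b_keys)
# Only S(i) needs to be joined with B(i); partition pairs that are empty on
# either side are skipped entirely.
for i in range(10):
    if s_parts[i] and b_parts[i]:
        print(f"compare partition {i}: S{i}={s_parts[i]} with B{i}={b_parts[i]}")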
Furthermore, if either of the partitions Si and Bj is NULL, then the other corresponding partition can be ignored. As shown in Figure 1, it is only necessary to join those bucket pairs in which both partitions are non-empty. Hence, the number of key comparisons is minimized through value-based partitioning. In contrast, nested loops joins and sort merge joins must compare all tuples from both tables, even tuples that are not in the final output. Obviously, this can be very expensive for some queries. A bitmap vector of the unique join column values of the build input is created as the build input is read into hash area memory for partitioning. The bitmap memory comes from the hash area and can take up to five percent of HASH_AREA_SIZE. In this example, the following bitmap vector will be built:
{ 1, 3, 4, 5, 8, 10 }
The bitmap vector is used during the partitioning phase of the larger input, B, to determine whether a row being read is needed for the join; if not, it is discarded. If any partition Si fills during the table scan, then that partition is scheduled to be written out to temporary segments on disk. The I/O is performed asynchronously. At the end of the table scan of S, a hash table is built from the maximum number of partitions of S that can be accommodated in the available memory. Note that available memory must allow for 15 to 20 percent overhead for managing the bitmap vector, hash table, partition table, and so on. After the full table scan of S, the big table, B, is read. (Table B will be used as the probe input, as described below.) As table B is read, the join column value is compared with the bitmap vector. If the bitmap vector contains the join value, then the row from B is preserved and partitioned into the appropriate bucket. Otherwise, the row from B is discarded. In this example, the following rows of B are discarded as they are read:
{ 0, 0, 2, 2, 2, 2, 2, 2, 9, 9, 9, 9, 9 }
The method of using a bitmap to eliminate rows during the partitioning phase is known as bit-vector filtering. The overall cost of the hash join method may be greatly reduced by these initial eliminations. This is particularly true when the join cardinality is low relative to the sizes of the join tables, and when the big input has a large number of rows to be joined with the small input. If the join value from B is in the bit-vector and falls into one of the partitions currently held in memory, then the join is performed and the result is returned. Otherwise, the relevant select-list columns are written out to temporary segments. After the first pass over B, the maximum number of unprocessed partitions of S are read from temporary segments into memory and a hash table is built. The previously written partitions of B are then re-read to perform an in-memory join. After the first pass over S and B is completed, when the next pair of partitions Si and Bi must be read from disk, the sizes of the Si and Bi partitions are compared before determining the join order. The smaller of the two is used as the build input because it has a better chance of fitting into the available memory. This swapping of build and probe inputs is known as dynamic role reversal. The hash join algorithm terminates when all partition pairs Si and Bi have been processed. A histogram-driven hybrid hash join keeps track of the partition sizes and the data distribution within the partitions at run time, and chooses as build input the partitions that minimize the I/O in the later stages of the algorithm. The Oracle hash join algorithm uses the hybrid hash join method with bit-vector filtering and dynamic role reversal. Figure 2 below illustrates the various steps in the hash join algorithm.
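Before turning to the figure, the bit-vector filtering step can be sketched as follows. This is only an illustration: the real bitmap is a compact bit structure taking up to five percent of the hash area, whereas a Python set stands in for it here, and the key values are those of the example above.

# Sketch of bit-vector filtering (illustrative only).
bitmap = {1, 3, 4, 5, 8, 10}    # unique join keys of the build input S, as in the example above
b_keys = [0, 0, 10, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 8, 8, 8, 8, 9, 9, 9, 9, 9]

kept = [k for k in b_keys if k in bitmap]           # rows of B partitioned for the join
discarded = [k for k in b_keys if k not in bitmap]  # rows of B eliminated before partitioning
print("kept for partitioning:", kept)
print("discarded:", discarded)   # matches { 0, 0, 2, 2, 2, 2, 2, 2, 9, 9, 9, 9, 9 }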
Figure 2: Steps in the hash join algorithm

1. Determine the fan-out as M / C. Tune the fan-out to account for bit-vector memory, PQ buffers, asynchronous I/O, and so on.
2. Read S, hash each row into a partition using a hash function on the join key, and generate and store a hash value for the next level.
3. Build a bitmap of the unique join keys.
4. Put each row into an empty slot. If there is no space in the current slot, or no more empty slots, schedule writes to disk.
5. At the end of S, try to find the smallest N partitions of S with which to perform the probe with B. Write the remaining (fan-out - N) partitions to disk.
6. Build a hash table from the N partitions of S held in memory.
7. Read the probe input B and filter it through the bit-vector; if the join value is not present, discard the row.
8. For each filtered probe row, use the same hash function as before to find the partition. If the row falls into a partition in memory, perform the join; otherwise, write the row to disk.
9. At the end of B, read (S,B) partition pairs from disk. Use the smaller input for creating the hash table (dynamic role reversal).
10. Iterate over the probe. If the build input is more than 10 times the size of memory, use a nested hash loops join.

LEGEND
M: hash area memory
P1 ... P5: partitions
C11 ... C54: clusters within a partition; an in-memory cluster is a slot. The following condition must be met: (#slots) * cluster size > M - overhead. The worst case is no fewer than 6 slots.
HASH JOINS ALGORITHM

Step 1: Calculate the fan-out as

    Fan-out = (Favm * M) / C

where Favm is the usable fraction of the hash area, M is the hash area memory, and C is the cluster size. (See the section Minimum Setting for HASH_AREA_SIZE.) The fan-out calculation takes into account the overhead required for the partition table, hash tables, and bitmap vector filter, plus some extra buffers required for asynchronous I/O and parallel query. (Calculating optimal fan-out is described in the section On-Disk Hash Join.)

WHILE ( ROWS IN S ) LOOP
  Step 2: Read the join column value and select-list items from S. Use an internal hash function, say HASH_FUN_1, to map the join column value to a partition. Also generate a hash value for the next level using another hash function, say HASH_FUN_2, and store it with the join key. The second hash value will be used to build a hash table in a later part of the algorithm.
  Step 3: Build a bitmap vector of all unique keys in the build input.
  Step 4: If there is no space in the partition, write the partition to disk.
END LOOP

Step 5: Try to fit the maximum possible number of partitions in memory, from which a hash table can be built, and flush the rest to disk.
  Step 5.1: Sort the partitions by size.
  Step 5.2: Starting with the smallest, choose as many partitions as fit in memory; call this number X.
  Step 5.3: Schedule disk writes for the other (Fan-out - X) partitions.

Step 6: Build a hash table from the X in-memory partitions of S, using the hash value already calculated for the next level with HASH_FUN_2.

WHILE ( ROWS TO BE READ FROM B ) LOOP
  Step 7.1: Filter rows using the bit-vector.
  Step 7.2: Hash the rows that pass the bit-vector filter into the appropriate partition, using the join key and the internal HASH_FUN_1.
  Step 7.3: IF the hashed row falls into a partition in memory, perform the join by applying the internal HASH_FUN_2 value and traversing the appropriate hash bucket; ELSE write to disk, to the appropriate on-disk partition (partition pairs Si and Bi on disk are tracked through the partition table), the HASH_FUN_2 value, the join keys, and the select-list items.
END LOOP

Step 8: Read unprocessed (S,B) partition pairs from disk. Use the smaller input to build the hash table and the larger one for the probe. In building the hash table, the internal HASH_FUN_2 value is used. This results in dynamic role reversal, leading to zigzag execution trees.
On the first iteration, the optimizer should select the smaller table to be the build input and the larger table to be the probe; role reversal takes place only after the first iteration.

Step 9: If the smaller of the two inputs is still too large to fit in memory, then read the smaller (build) input into memory in chunks and iterate over the probe input for each chunk. This is called the nested hash loops join.
END OF HASH JOIN ALGORITHM
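To tie Steps 6 through 8 together, here is a simplified sketch of the build-and-probe phase for one in-memory partition pair. The function hash_fun_2 stands in for the internal second-level hash function; the rows and column names are hypothetical, and the code is an illustration rather than Oracle's implementation.

# Simplified build-and-probe for one in-memory partition pair (illustrative only).
def hash_fun_2(key, buckets=8):
    return hash(key) % buckets                     # stand-in for the internal HASH_FUN_2

def build_hash_table(build_rows, key="a"):
    table = {}
    for row in build_rows:                         # Step 6: build hash table from an S partition
        table.setdefault(hash_fun_2(row[key]), []).append(row)
    return table

def probe(hash_table, probe_rows, key="a"):
    for row in probe_rows:                         # Steps 7-8: probe with the filtered B rows
        for cand in hash_table.get(hash_fun_2(row[key]), []):
            if cand[key] == row[key]:              # resolve hash collisions with the real key
                yield cand, row

s_part = [{"a": 1, "x": "s1"}, {"a": 3, "x": "s2"}]
b_part = [{"a": 3, "y": "b1"}, {"a": 5, "y": "b2"}]
print(list(probe(build_hash_table(s_part), b_part)))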
COSTING OF HASH JOINS

In-Memory Hash Join

Setting aside the CPU costs, which are orders of magnitude cheaper than disk I/O in terms of time, the dominant cost of an in-memory hash join is:

    Cost(HJ) = Read(S) + Read(B)

On-Disk Hash Join

When the build input does not fit in memory, the total cost is the sum of the costs of the two iterations:

    Cost(HJ) = Cost(HJ iteration 1) + Cost(HJ iteration 2)
The fan-out F, the number of partitions, depends on the I/O efficiency of the cluster size, C (see Equation 1 in the section Minimum Setting for HASH_AREA_SIZE). F is computed as:

    F = (Favm * M) / C

where Favm is the fraction of the hash area memory that is usable. Generally, Favm = 0.8 is a good estimate. If S > M, then the I/O cost of the hash join at the end of the first iteration is given by:

    Cost(HJ iteration 1) = Read(S) + Read(B) + Write( (S - M) + (B - B * M / S) )
That is, the left and right inputs must be scanned, and the partitions that do not fit in memory must be written to disk. Iteration 2 of the hash join algorithm processes the Si and Bi partition pairs on disk, using a nested hash loops join when a partitioned build input does not fit in memory. For each chunk of the build input read, the probe input is read again, so this requires multiple reads of the probe input:
    Cost(HJ iteration 2) = Read( (S - M) + n * (B - B * M / S) )
The dominant cost of a hash join is I/O on the probe input. If n is the number of steps required to complete iteration 2, then the I/O cost is approximately proportional to:
(S - M) + n (B - B * M / S)
Because the probe input may be reread more than once, n is used as a multiplier to calculate the re-reading cost. This multiplier n may be computed as the ratio of the size of each build partition (S / F) over the size of memory, M. Therefore:
n = (S / F) / M
When n is large (typically, greater than 10), the hash join algorithm will incur a lot of I/O writing and re-reading the data in the partitions.
For smaller values of n (typically, less than 10), the hash join should perform better than a sort merge join for normally distributed data. The value of n is inversely proportional to the fan-out, F. Increasing the fan-out will reduce n; when memory is fixed, the fan-out can be increased by reducing the cluster size, C.
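The relationships among F, n, and the iteration 2 read volume can be illustrated with a small calculation. All of the sizes below are hypothetical; they merely show how the formulas above interact.

# Rough illustration of the cost relationships above (all values hypothetical).
MB = 1024 * 1024
S, B = 100 * MB, 1000 * MB          # build and probe input sizes
M = 2 * MB                          # hash area memory
C = 64 * 1024                       # cluster size (HASH_MULTIBLOCK_IO_COUNT * DB_BLOCK_SIZE)
Favm = 0.8                          # usable fraction of the hash area

F = (Favm * M) / C                  # fan-out: number of partitions of S
n = (S / F) / M                     # how many times the probe data may be re-read
iter2_reads = (S - M) + n * (B - B * M / S)
print(f"fan-out F = {F:.0f}, multiplier n = {n:.2f}")
print(f"approximate iteration 2 read volume = {iter2_reads / MB:.0f} MB")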
HASH JOINS PERFORMANCE TUNING

Minimum Setting for HASH_AREA_SIZE

Let the cluster size, C, the unit of I/O for the hash join, be:

    C = DB_BLOCK_SIZE * HASH_MULTIBLOCK_IO_COUNT <= 64 K        (1)

Generally, C <= 64 K because this is the limit set by most operating systems for the unit of I/O. Let Favm, the fraction of the hash area memory that is usable, be:

    Favm = 0.8

We assume an overhead of 10 to 20 percent for storing the partition table, the bitmap of unique values of the left input S, and the hash table. A few buffers are also set aside for performing parallel query and asynchronous I/O. Let the fan-out, the maximum number of partitions into which S can be split, be:

    Fan-out = (Favm * Mcritical) / C
Hash join performance degrades when the partitions of the build input are not small enough to fit in the available memory. The degradation occurs because the probe input must be re-read multiple times to perform a nested hash loops join. (See the cost formula for iteration 2 above.)
Requiring the size of each partition of the build input, S / Fan-out, to be at most the available memory, we get:

    S / Fan-out <= Favm * Mcritical
    S / ( (Favm * Mcritical) / C ) <= Favm * Mcritical
    S * C <= (Favm)^2 * (Mcritical)^2
    Mcritical >= sqrt( S * C ) / Favm        (2)
Equation 2 is corroborated by our research findings. We observed that for any HASH_MULTIBLOCK_IO_COUNT, reducing hash area memory below Mcritical resulted in drastic performance degradation. A worst case scenario, where all rows are returned from B, was used. The test query used was as follows:
SELECT /*+ USE_HASH(B) ORDERED */ S.cola, S.colb, ..., S.coln
FROM S, B
WHERE S.cola = B.cola;
Average row sizes of S and B were 100 bytes each. The variables in the research tests were as follows:

    Table sizes: S = 10 MB or S = 100 MB, and B = 1000 MB
    HASH_AREA_SIZE = m * Mcritical, for several values of the multiplier m
    HASH_MULTIBLOCK_IO_COUNT = h, for several values of h
The size of S in Equation 3 was computed as the data going into the equi-join query; that is:

    Size(S) = Cardinality(S) * [ Average size(select list item 1)
                               + Average size(select list item 2)
                               + Average size(select list item 3)
                               + ...
                               + Average size(select list item n)
                               + Average size of the join keys, if not already in the select list items ]
Chart 1 below shows some of our research results when HASH_MULTIBLOCK_IO_COUNT and HASH_AREA_SIZE were varied. The results are for S = 100 MB and B = 1000 MB. Table S contained 1 million rows and table B contained 10 million rows. The join column was defined as NUMBER(10) in both tables. Degradation in some test runtimes was so severe that we were forced to terminate the tests. (In the chart, the data point at 16350 seconds, or about 4.5 hours, was terminated before completion.)
Chart 1: Performance of hash joins for different settings of HASH_MULTIBLOCK_IO_COUNT and HASH_AREA_SIZE, where Mcritical = X = sqrt(S * C) / Favm, S = 100 MB, B = 1000 MB, Favm = 80%
[Chart: elapsed time in seconds versus HASH_MULTIBLOCK_IO_COUNT (1, 2, 4, 8), with one series for each HASH_AREA_SIZE setting of 0.5x, x, 2x, and 5x Mcritical.]
Chart 1 above shows that a HASH_AREA_SIZE below Mcritical causes a drastic degradation in performance. Generally, hash area sizes greater than Mcritical resulted in better performance. However, a hash area size greater than Mcritical does not necessarily provide better performance: at and above Mcritical, HASH_MULTIBLOCK_IO_COUNT begins to impact performance (the trade-off between I/O efficiency and fan-out). For example, HASH_AREA_SIZE = 5 * Mcritical with HASH_MULTIBLOCK_IO_COUNT = 1 resulted in a sub-optimal fan-out calculation. Furthermore, comparing the performance of hash joins across HASH_MULTIBLOCK_IO_COUNT values for a given HASH_AREA_SIZE can be misleading, because a lower HASH_MULTIBLOCK_IO_COUNT utilizes less hash area memory. The amount of hash memory, Mcritical, was calculated by substituting values of HASH_MULTIBLOCK_IO_COUNT into Equation 2, with S = 10 MB or 100 MB and Favm = 0.8.
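The substitution described above can be written as a small helper. The 4 KB block size and the loop values below are illustrative assumptions, not measurements from our tests.

# Compute Mcritical = sqrt(S * C) / Favm for a range of settings (illustrative helper;
# a 4 KB block size is assumed here).
from math import sqrt

def m_critical(s_bytes, hash_multiblock_io_count, db_block_size=4096, favm=0.8):
    c = hash_multiblock_io_count * db_block_size      # cluster size C (Equation 1)
    return sqrt(s_bytes * c) / favm

MB = 1024 * 1024
for s in (10 * MB, 100 * MB):
    for h in (1, 2, 4, 8):
        print(f"S = {s // MB:>3} MB, HASH_MULTIBLOCK_IO_COUNT = {h}: "
              f"Mcritical is approximately {m_critical(s, h) / MB:.1f} MB")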
HASH_AREA_SIZE and Tablespace I/O

The above can be exploited when there is plenty of system memory but a shortage of disk space for the temporary tablespace. The idea is to increase HASH_AREA_SIZE until the small table fits entirely in the hash area memory. In such cases, assuming that CPU power is inexpensive compared to disk I/O, the cost is as follows:
    Cost(HJ) = Read(S) + Read(B)
HASH_MULTIBLOCK_IO_COUNT
Higher values of HASH_MULTIBLOCK_IO_COUNT make I/O more efficient because large chunks of the cluster size, C, can be read or written with a single I/O. However, this also reduces the fan-out, F = (Favm * M) / C, which means there will be fewer, larger partitions that may not fit in memory.
HASH_MULTIBLOCK_IO_COUNT should be set to 1 when the amount of data in the smaller table is orders of magnitude larger than the available hash memory. This results in the maximum possible number of partitions, each of which has a better chance of fitting into memory.
Because Oracle performs I/O in 64 K chunks and DB_BLOCK_SIZE is fixed, HASH_MULTIBLOCK_IO_COUNT can be varied such that the following holds true:
    DB_BLOCK_SIZE * HASH_MULTIBLOCK_IO_COUNT <= 64 K
Using the above formula, HASH_MULTIBLOCK_IO_COUNT can be varied until the optimum value giving the best performance for a query is found. If it is not possible to fine-tune individual queries, it is relatively safe to set HASH_MULTIBLOCK_IO_COUNT to a moderate value, such as 4 or 8, for DB_BLOCK_SIZE = 4 K. An inappropriate hash function can lead to excessive collisions in the hash table and to partitions of uneven sizes. In the worst case, all key values fall into a single partition and one complete iteration is wasted [3]. It might also be the case that the fan-out yields too many or too few partitions. In these cases, a different setting for HASH_MULTIBLOCK_IO_COUNT may help. Note that skew in the partitions is not necessarily bad, as long as the smaller of the two partitions in a pair is small enough relative to the available memory; it does not matter which of the two is smaller, because of dynamic role reversal. Furthermore, data skew may have little effect when a large number of rows are eliminated through bit-vector filtering.
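The trade-off between cluster size and fan-out can be tabulated with a short script. The 4 KB block size and 4 MB hash area below are illustrative assumptions only.

# Enumerate HASH_MULTIBLOCK_IO_COUNT settings satisfying
# DB_BLOCK_SIZE * HASH_MULTIBLOCK_IO_COUNT <= 64 KB, and the fan-out each implies
# (illustrative; a 4 KB block size and a 4 MB hash area are assumed).
DB_BLOCK_SIZE = 4096
HASH_AREA = 4 * 1024 * 1024
FAVM = 0.8
LIMIT = 64 * 1024

count = 1
while DB_BLOCK_SIZE * count <= LIMIT:
    c = DB_BLOCK_SIZE * count                 # cluster size C
    fanout = int((FAVM * HASH_AREA) / c)      # F = (Favm * M) / C
    print(f"HASH_MULTIBLOCK_IO_COUNT = {count:2d}: cluster = {c // 1024:2d} KB, fan-out = {fanout}")
    count *= 2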
TROUBLESHOOTING
The following tips should be useful in troubleshooting problems with hash joins.

1. Make sure that the smaller table is the driving table (build input).
2. Check that the tables and/or columns of the join tables are appropriately analyzed. Histograms are recommended only for non-uniform column distributions.
3. If needed, you can override the join order chosen by the cost based optimizer using the ORDERED hint.
4. Check that the hash area size, M, allocated for hash joins is at least Mcritical. (See Equation 2.)
5. To eliminate or minimize I/O to the temporary tablespace, try setting the hash area size, M, such that M >= Mcritical and M = [ 1.6 * SIZE(S) ].
6. For parallel hash joins, make sure there is no skew in the slave processes' workloads by monitoring CPU usage and the messages sent and received using V$PQSTAT. Monitoring these statistics over the elapsed time of the operation will show whether there is slave workload skew. Skew can also occur because there are very few distinct values in the column being equi-joined.

The figure below illustrates two sets of slave processes in a parallel hash join of degree 8; the table scanners are ora_p000 to ora_p007 and the hashers are ora_p008 to ora_p015. The two slave sets have different CPU profiles and show an approximately uniform workload among the slaves. The figure also illustrates the inter-operator and intra-operator parallelism used in the Oracle Parallel Query architecture. The CPU profile was monitored using the Unix ps command over the elapsed time of the hash join query.
[Figure: CPU usage for parallel slaves versus elapsed time for a degree-8 parallel hash join. Series include the parallel query slaves ora_p000 through ora_p015 and the background processes smon, pmon, dbwr, and lgwr; the x-axis is elapsed time and the y-axis is CPU usage.]
APPENDICES
APPENDIX A Citations
[1] Oracle7 Server Documentation, Addendum, Release 7.1, Part No. A12042-3.
[2] Oracle Release 7.3 Server Addendum, and Oracle Release 7.3 Readme File.
[3] A Performance Evaluation of Histogram-Driven Recursive Hybrid Hash Join, Ph.D. dissertation, Graefe, Goetz, Portland State University.
[4] Sort-Merge Join: An Idea Whose Time Has(H) Passed?, IEEE International Conference on Data Engineering, February 1994, Graefe, Goetz.
[5] September 1996 to January 1997 interviews: Cetin Ozbutun, Sameer Patkar, Linda Willis, Gary Hallmark.
[6] Query Processing in Parallel Relational Database Systems, section Parallel Processing of Joins, IEEE Computer Society Press, 199?, Hongjun Lu, et al.
[7] Tuning Large Sorts, Technical Reports Compendium, Volume I, 1996, Center of Expertise, Oracle Worldwide Customer Support.
[8] Oracle confidential development documents, by Cetin Ozbutun.
APPENDIX B Enabling Hash Joins

A hash join can be forced using a hint, as shown in the following example:
SELECT /*+ USE_HASH(S) */ S.a, B.a, ...
FROM S, B
WHERE S.a = B.a;
Unlike SORT_AREA_SIZE, which requires restarting the database, the dynamic hash join parameters allow individual queries to be tuned per session. The dynamic parameters affecting hash joins are HASH_MULTIBLOCK_IO_COUNT and HASH_AREA_SIZE.
HASH_AREA_SIZE determines the maximum amount of memory to be used for a hash join. HASH_MULTIBLOCK_IO_COUNT determines how many blocks are read and written at a time to the temporary tablespace; this parameter is similar in functionality to DB_FILE_MULTIBLOCK_READ_COUNT.
Because HASH_JOIN_ENABLED = TRUE by default, customers might want to look out for queries whose access plans have changed after upgrading to Release 7.3. Alternatively, disable this feature and selectively turn it on per session using the ALTER SESSION command, enabling it in production only once you are satisfied that hash joins perform adequately in your setting.