Index Skip Scan
Selectivity is the fraction of rows in a row set that a predicate selects. The row set can be a base table, a view, or the result of a join or a GROUP BY operator. The selectivity is tied to a query predicate, such as last_name = 'Smith', or a combination of predicates, such as last_name = 'Smith' AND job_type = 'Clerk'. A predicate filters a specific number of rows from a row set, so its selectivity indicates what fraction of the rows pass the predicate test. Selectivity ranges from 0.0 to 1.0. A selectivity of 0.0 means that no rows are selected from a row set, whereas a selectivity of 1.0 means that all rows are selected. A predicate becomes more selective as the value approaches 0.0 and less selective (or more unselective) as the value approaches 1.0.
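As a rough illustration of the definition above, the following sketch computes selectivity as the fraction of rows that pass a predicate. The sample rows and the selectivity helper are hypothetical and exist only for this example; a real optimizer estimates selectivity from statistics rather than by counting rows.

```python
# Hypothetical sample rows; names and values are made up for illustration.
rows = [
    {"last_name": "Smith", "job_type": "Clerk"},
    {"last_name": "Smith", "job_type": "Manager"},
    {"last_name": "Jones", "job_type": "Clerk"},
    {"last_name": "Brown", "job_type": "Analyst"},
]


def selectivity(row_set, predicate):
    """Return the fraction of rows in row_set for which predicate is true."""
    if not row_set:
        raise ValueError("selectivity is undefined for an empty row set")
    matching = sum(1 for row in row_set if predicate(row))
    return matching / len(row_set)


# Single predicate: last_name = 'Smith'
by_name = selectivity(rows, lambda r: r["last_name"] == "Smith")

# Combined predicates: last_name = 'Smith' AND job_type = 'Clerk'
by_name_and_job = selectivity(
    rows, lambda r: r["last_name"] == "Smith" and r["job_type"] == "Clerk"
)

print(by_name)          # 0.5
print(by_name_and_job)  # 0.25
```

In this sample the combined predicate has the lower value, so it is the more selective of the two.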
Composite Indexes
A composite index contains multiple key columns. Composite indexes can provide additional advantages over single-column indexes:
Improved selectivity - Sometimes you can combine two or more columns or expressions, each with poor selectivity, to form a composite index with higher selectivity.
Fewer indexes - A composite index reduces the number of indexes needed to support a range of queries. This improves performance by reducing index maintenance and cuts the space wasted on multiple indexes.
A SQL statement can use an access path involving a composite index when the statement contains constructs that use a leading portion of the index. A leading portion of an index is a set of one or more columns that were specified first and consecutively in the list of columns in the CREATE INDEX statement that created the index. Consider this CREATE INDEX statement:
CREATE INDEX comp_index ON table1(x, y, z);
The x, xy, and xyz combinations of columns are leading portions of the index. The yz, y, and z combinations of columns are not leading portions of the index.
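The leading-portion rule can be checked mechanically. The sketch below is only an illustration: is_leading_portion is a hypothetical helper, not an Oracle function, and it assumes that the order in which a statement mentions the columns does not matter, only which columns it references.

```python
def is_leading_portion(index_columns, columns):
    """Return True if `columns` is exactly the first k columns of the index.

    index_columns: columns in CREATE INDEX order, e.g. ("x", "y", "z").
    columns: the columns referenced by the statement, in any order.
    """
    cols = set(columns)
    k = len(cols)
    return 0 < k <= len(index_columns) and set(index_columns[:k]) == cols


# Column list of comp_index from the CREATE INDEX statement above.
comp_index = ("x", "y", "z")

for combo in (["x"], ["x", "y"], ["x", "y", "z"], ["y", "z"], ["y"], ["z"]):
    print(combo, is_leading_portion(comp_index, combo))
```

The first three combinations report True and the last three report False, matching the lists above.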
The index skip scan was introduced to allow Oracle to use a multi-column index even when the query has no predicate on the leading column. You can force an index skip scan with the /*+ index_ss */ hint. For example, consider the following concatenated index on a very low cardinality column, followed by a very selective column:
The customers table has a column cust_gender whose values are either M or F. Assume that a composite index exists on the columns (cust_gender, cust_email) that was created as follows:
CREATE INDEX customers_gender_email ON sh.customers (cust_gender, cust_email);
The database can use a skip scan of this index even though cust_gender is not specified in the WHERE clause. In a skip scan, the number of logical subindexes is determined by the number of distinct values in the leading column. In this example, the leading column has two possible values. The database logically splits the index into one subindex with the key F and a second subindex with the key M. When searching for the record for the customer whose email is [email protected], the database searches the subindex with the value F first and then searches the subindex with the value M. Conceptually, the database processes the query as follows:
SELECT * FROM sh.customers
WHERE cust_gender = 'F' AND cust_email = '[email protected]'
UNION ALL
SELECT * FROM sh.customers
WHERE cust_gender = 'M' AND cust_email = '[email protected]';
How It Works
Rather than restricting the search path using a predicate from the statement, a skip scan is initiated by probing the index for distinct values of the prefix column. Each of these distinct values is then used as a starting point for a regular index search. The result is several separate searches of a single index that, when combined, eliminate the effect of the prefix column. Essentially, the index has been searched from the second level down. The optimizer uses statistics to decide whether a skip scan would be more efficient than a full table scan.
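The probe-then-search behavior described above can be modeled on a sorted list. This is a conceptual sketch only, not how Oracle implements its B-tree traversal: the index is assumed to be a sorted list of (prefix, second_column, rowid) tuples, and the sample entries, skip_scan, and _first_pos_after_prefix are all hypothetical names introduced for this example.

```python
from bisect import bisect_left


def _first_pos_after_prefix(index, prefix, lo):
    """Binary search for the first position whose prefix value is > prefix."""
    hi = len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if index[mid][0] <= prefix:
            lo = mid + 1
        else:
            hi = mid
    return lo


def skip_scan(index, second_value):
    """Find entries whose second key column equals second_value.

    No predicate on the prefix column is needed: each distinct prefix
    value is discovered by probing, then searched like a regular index.
    """
    matches = []
    pos = 0
    while pos < len(index):
        prefix = index[pos][0]  # next distinct value of the prefix column
        # Regular index search inside this logical subindex.
        start = bisect_left(index, (prefix, second_value), pos)
        while start < len(index) and index[start][:2] == (prefix, second_value):
            matches.append(index[start])
            start += 1
        # Skip the rest of this subindex and jump to the next prefix value.
        pos = _first_pos_after_prefix(index, prefix, pos)
    return matches


# Hypothetical index entries: (cust_gender, cust_email, rowid).
index = sorted([
    ("F", "ann@example.com", 101),
    ("F", "zoe@example.com", 102),
    ("M", "ann@example.com", 103),
    ("M", "bob@example.com", 104),
])

print(skip_scan(index, "ann@example.com"))
```

The number of separate searches equals the number of distinct prefix values, which is why the technique pays off mainly when the prefix column has few distinct values.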
The prefix column should be the most discriminating and the most widely used in queries. These two conditions do not always go hand in hand, which makes the decision difficult. In these situations, skip scanning reduces the impact of making the "wrong" decision.