Bitmap Indexes: Table 11-6. A Representation of How Oracle Would Store the Bitmap Index
Bitmap indexes were added to Oracle in version 7.3 of the database. They are currently
available with the Oracle Enterprise and Personal Editions, but not the Standard Edition.
Bitmap indexes are designed for data warehousing/ad hoc query environments where the full
set of queries that may be asked of the data is not totally known at system implementation
time. They are specifically not designed for OLTP systems or systems where data is
frequently updated by many concurrent sessions.
Bitmap indexes are structures that store pointers to many rows with a single index key
entry, as compared to a B*Tree structure where there is parity between the index keys and the
rows in a table. In a bitmap index, there will be a very small number of index entries, each of
which points to many rows. In a conventional B*Tree, one index entry points to a single row.
Let’s say we are creating a bitmap index on the JOB column in the EMP table as follows:
ops$tkyte@ORA10G> create BITMAP index job_idx on emp(job);
Index created.
Oracle will store something like what is shown in Table 11-6 in the index.
Table 11-6. A representation of how Oracle would store the JOB_IDX bitmap index
Value/Row   1  2  3  4  5  6  7  8  9 10 11 12 13 14
ANALYST     0  0  0  0  0  0  0  1  0  1  0  0  1  0
CLERK       1  0  0  0  0  0  0  0  0  0  1  1  0  1
MANAGER     0  0  0  1  0  1  1  0  0  0  0  0  0  0
PRESIDENT   0  0  0  0  0  0  0  0  1  0  0  0  0  0
SALESMAN    0  1  1  0  1  0  0  0  0  0  0  0  0  0
Table 11-6 shows that rows 8, 10, and 13 have the value ANALYST, whereas rows 4, 6,
and 7 have the value MANAGER. It also shows us that no rows are null (bitmap indexes store
null entries; the lack of a null entry in the index implies there are no null rows). If we wanted
to count the rows that have the value MANAGER, the bitmap index would do this very
rapidly. If we wanted to find all the rows such that the JOB was CLERK or MANAGER, we
could simply combine their bitmaps from the index, as shown in Table 11-7.
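The combination can be sketched in a few lines of Python. This is only an illustration of the idea, not how Oracle implements it: the two bitmaps are copied from Table 11-6 as plain lists, and the helper names bitmap_or and rows_set are made up for this sketch.

```python
# Bitmaps copied from Table 11-6; list index 0 holds the bit for row 1.
CLERK   = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
MANAGER = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]


def bitmap_or(a, b):
    """Combine two equal-length bitmaps position by position."""
    return [x | y for x, y in zip(a, b)]


def rows_set(bitmap):
    """Return the 1-based row numbers whose bit is on."""
    return [pos + 1 for pos, bit in enumerate(bitmap) if bit]


combined = bitmap_or(CLERK, MANAGER)
print(rows_set(combined))  # row numbers where JOB is CLERK or MANAGER
print(sum(combined))       # counting the 1 bits answers a count(*) query
```

Counting needs nothing but the merged bitmap, which is why a count(*) with these predicates never has to visit the table.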
Table 11-7 rapidly shows us that rows 1, 4, 6, 7, 11, 12, and 14 satisfy our criteria. The
bitmap Oracle stores with each key value is set up so that each position represents a rowid in
the underlying table, if we need to actually retrieve the row for further processing. Queries
such as the following:
select count(*) from emp where job = 'CLERK' or job = 'MANAGER'
will be answered directly from the bitmap index. A query such as this:
select * from emp where job = 'CLERK' or job = 'MANAGER'
on the other hand, will need to get to the table. Here, Oracle will apply a function to turn the
fact that the i’th bit is on in a bitmap into a rowid that can be used to access the table.
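The conversion function itself is internal to Oracle and is not described here. As a rough mental model only, assume each bitmap piece starts at a known block and that every block is allotted the same number of bit positions. Both assumptions, and the names RowAddress, bit_to_address, and rows_per_block, are hypothetical.

```python
from typing import NamedTuple


class RowAddress(NamedTuple):
    block: int  # block number within the table (illustrative, not a real rowid)
    slot: int   # row slot within that block


def bit_to_address(start_block, bit_position, rows_per_block):
    """Map the i'th bit of a bitmap piece to a (block, slot) pair.

    Assumes every block is allotted exactly rows_per_block bit positions.
    """
    block_offset, slot = divmod(bit_position, rows_per_block)
    return RowAddress(start_block + block_offset, slot)


def addresses_for(bitmap, start_block, rows_per_block):
    """Convert every 1 bit into an address; 0 bits are skipped."""
    return [bit_to_address(start_block, i, rows_per_block)
            for i, bit in enumerate(bitmap) if bit]


print(addresses_for([0, 1, 0, 0, 1, 1], start_block=100, rows_per_block=4))
```

Under these assumptions the conversion is pure arithmetic on the bit position, so it needs no lookup structure.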
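Consider ad hoc queries against a table T with GENDER, LOCATION, and AGE_GROUP columns, such as these:
select *
  from t
 where ( ( gender = 'M' and location = 20 )
    or ( gender = 'F' and location = 22 ))
   and age_group = '18 and under';
select count(*) from t where age_group = '41 and over' and gender = 'F';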
You would find that a conventional B*Tree indexing scheme would fail you. If you
wanted to use an index to get the answer, you would need at least three and up to six
combinations of possible B*Tree indexes to access the data via the index. Since any of the
three columns or any subset of the three columns may appear, you would need large
concatenated B*Tree indexes on
* GENDER, LOCATION, AGE_GROUP: For queries that used all three, or GENDER with
LOCATION, or GENDER alone
* LOCATION, AGE_GROUP: For queries that used LOCATION and AGE_GROUP or
LOCATION alone
* AGE_GROUP, GENDER: For queries that used AGE_GROUP with GENDER or
AGE_GROUP alone
To reduce the amount of data being searched, other permutations might be reasonable as
well, to decrease the size of the index structure being scanned. This is ignoring the fact that a
B*Tree index on such low cardinality data is not a good idea.
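A small Python check confirms that the three concatenated indexes listed above cover every combination of the three columns. It uses a deliberately simplified rule, which is an assumption of this sketch: an index serves a predicate only when its leading columns are exactly the columns in the predicate. The function names are invented for the illustration.

```python
from itertools import combinations

COLUMNS = ("GENDER", "LOCATION", "AGE_GROUP")

# The three concatenated B*Tree indexes from the list above.
INDEXES = [
    ("GENDER", "LOCATION", "AGE_GROUP"),
    ("LOCATION", "AGE_GROUP"),
    ("AGE_GROUP", "GENDER"),
]


def covering_index(predicate_columns, indexes):
    """Return the first index whose leading columns match the predicate, else None."""
    wanted = set(predicate_columns)
    for index in indexes:
        if set(index[:len(wanted)]) == wanted:
            return index
    return None


def all_subsets(columns):
    """Yield every non-empty combination of the columns."""
    for size in range(1, len(columns) + 1):
        yield from combinations(columns, size)


for subset in all_subsets(COLUMNS):
    print(subset, "->", covering_index(subset, INDEXES))
```

Three columns give seven combinations. Every column added roughly doubles that number, which is the growth the concatenated-index approach cannot keep up with.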
Here the bitmap index comes into play. With three small bitmap indexes, one on each of
the individual columns, you will be able to satisfy all of the previous predicates efficiently.
Oracle will simply use the functions AND, OR, and NOT, with the bitmaps of the three
indexes together, to find the solution set for any predicate that references any set of these
three columns. It will take the resulting merged bitmap, convert the 1s into rowids if
necessary, and access the data (if you are just counting rows that match the criteria, Oracle
will just count the 1 bits). Let’s take a look at an example. First, we’ll generate test data that
matches our specified distinct cardinalities—index it and gather statistics. We’ll make use of
the DBMS_RANDOM package to generate random data fitting our distribution:
ops$tkyte@ORA10G> create table t
  2  ( gender not null,
  3    location not null,
  4    age_group not null,
  5    data
  6  )
  7  as
  8  select decode( ceil(dbms_random.value(0,2)),   -- range starts at 0 so CEIL yields 1 or 2
  9                 1, 'M',
 10                 2, 'F' ) gender,
 11         ceil(dbms_random.value(0,50)) location, -- 1 through 50
 12         decode( ceil(dbms_random.value(0,5)),   -- 1 through 5
 13                 1,'18 and under',
 14                 2,'19-25',
 15                 3,'26-30',
 16                 4,'31-40',
 17                 5,'41 and over'),
 18         rpad( '*', 20, '*')
 19    from big_table.big_table
 20   where rownum <= 100000;
Table created.
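Now, with a bitmap index in place on each of the three columns (GENDER_IDX, LOCATION_IDX, and AGE_GROUP_IDX in the plans that follow), we’ll take a look at the plans for our various ad hoc queries from earlier: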
ops$tkyte@ORA10G> Select count(*)
  2    from T
  3   where gender = 'M'
  4     and location in ( 1, 10, 30 )
  5     and age_group = '41 and over';
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     BITMAP CONVERSION (COUNT) (Cost=5 Card=1 Bytes=13)
   3    2       BITMAP AND
   4    3         BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
   5    3         BITMAP OR
   6    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   7    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   8    5           BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   9    3         BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX' (INDEX (BITMAP))
This example shows the power of the bitmap indexes. Oracle is able to see the location in
(1,10,30) and knows to read the index on location for these three values and logically OR
together the “bits” in the bitmap. It then takes that resulting bitmap and logically ANDs that
with the bitmaps for AGE_GROUP='41 AND OVER' and GENDER='M'. Then a simple count of
1s and the answer is ready.
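The same sequence of steps can be imitated in Python, using integers as bitmaps so that | and & play the roles of BITMAP OR and BITMAP AND. This is a sketch of the logic only: the five sample rows and the helpers build_bitmap_index and count_bits are invented for the illustration and say nothing about Oracle's storage format.

```python
from collections import defaultdict


def build_bitmap_index(rows, column):
    """Return {value: int bitmap}; bit i is set when rows[i][column] == value."""
    index = defaultdict(int)
    for position, row in enumerate(rows):
        index[row[column]] |= 1 << position
    return index


def count_bits(bitmap):
    """Count the 1 bits, as BITMAP CONVERSION (COUNT) would."""
    return bin(bitmap).count("1")


# A handful of made-up rows standing in for table T.
rows = [
    {"gender": "M", "location": 1,  "age_group": "41 and over"},
    {"gender": "F", "location": 10, "age_group": "41 and over"},
    {"gender": "M", "location": 30, "age_group": "19-25"},
    {"gender": "M", "location": 10, "age_group": "41 and over"},
    {"gender": "M", "location": 20, "age_group": "41 and over"},
]

gender_idx = build_bitmap_index(rows, "gender")
location_idx = build_bitmap_index(rows, "location")
age_group_idx = build_bitmap_index(rows, "age_group")

# location in (1, 10, 30): OR the three single-value bitmaps together.
locations = location_idx[1] | location_idx[10] | location_idx[30]
# AND the result with the GENDER and AGE_GROUP bitmaps.
merged = gender_idx["M"] & locations & age_group_idx["41 and over"]

print(count_bits(merged))
```

No row is touched after the indexes are built. The answer comes entirely from combining three small structures.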
ops$tkyte@ORA10G> select *
  2    from t
  3   where ( ( gender = 'M' and location = 20 )
  4      or ( gender = 'F' and location = 22 ))
  5     and age_group = '18 and under';
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=77 Card=507 Bytes=16731)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=77 Card=507 …
   2    1     BITMAP CONVERSION (TO ROWIDS)
   3    2       BITMAP AND
   4    3         BITMAP INDEX (SINGLE VALUE) OF 'AGE_GROUP_IDX' (INDEX (BITMAP))
   5    3         BITMAP OR
   6    5           BITMAP AND
   7    6             BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
   8    6             BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
   9    5           BITMAP AND
  10    9             BITMAP INDEX (SINGLE VALUE) OF 'GENDER_IDX' (INDEX (BITMAP))
  11    9             BITMAP INDEX (SINGLE VALUE) OF 'LOCATION_IDX' (INDEX (BITMAP))
This shows similar logic: the plan shows the OR’d conditions are each evaluated by AND-ing
together the appropriate bitmaps and then OR-ing together those results. Throw in another
AND to satisfy the AGE_GROUP='18 AND UNDER' and we have it all. Since we asked for the
actual rows this time, Oracle will convert each 1 in the resulting bitmap into a rowid to
retrieve the source data.
In a data warehouse or a large reporting system supporting many ad hoc SQL queries,
this ability to use as many indexes as make sense simultaneously comes in very handy
indeed. Using conventional B*Tree indexes here would not be nearly as useful or usable, and
as the number of columns that are to be searched by the ad hoc queries increases, the number
of combinations of B*Tree indexes you would need increases as well.
However, there are times when bitmaps are not appropriate. They work well in a read-
intensive environment, but they are extremely ill suited for a write-intensive environment.
The reason is that a single bitmap index key entry points to many rows. If a session modifies
the indexed data, then all of the rows that index entry points to are effectively locked in most
cases. Oracle cannot lock an individual bit in a bitmap index entry; it locks the entire bitmap
index entry. Any other modifications that need to update that same bitmap index entry will be
locked out. This will seriously inhibit concurrency, as each update will appear to lock
potentially hundreds of rows, preventing their bitmap columns from being concurrently
updated. It will not lock every row, as you might think, just many of them. Bitmaps are
stored in chunks, so using the earlier EMP example we might find that the index key
ANALYST appears in the index many times, each time pointing to hundreds of rows. An
update to a row that modifies the JOB column will need to get exclusive access to two of
these index key entries: the index key entry for the old value and the index key entry for the
new value. The hundreds of rows these two entries point to will be unavailable for
modification by other sessions until that UPDATE commits.
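The effect on concurrency can be modeled with a toy example. It assumes, purely for illustration, that each index key entry covers a fixed-size chunk of row positions (CHUNK_SIZE is a made-up number, as are the function names) and that an update holds its two entries until it commits.

```python
CHUNK_SIZE = 4  # hypothetical number of row positions covered by one index key entry


def entry_for(value, row_position):
    """Identify the index key entry (value, chunk number) covering a row."""
    return (value, row_position // CHUNK_SIZE)


def entries_locked_by_update(row_position, old_value, new_value):
    """An update of the indexed column needs the old and the new key entry."""
    return {entry_for(old_value, row_position), entry_for(new_value, row_position)}


def conflicts(update_a, update_b):
    """True when two uncommitted updates need the same index key entry."""
    return bool(entries_locked_by_update(*update_a)
                & entries_locked_by_update(*update_b))


# Session 1 changes row 1 from CLERK to ANALYST.
session_1 = (1, "CLERK", "ANALYST")
# Session 2 changes a different row in the same chunk from MANAGER to CLERK.
session_2 = (2, "MANAGER", "CLERK")
# Session 3 makes the same kind of change to a row in a later chunk.
session_3 = (9, "MANAGER", "CLERK")

print(conflicts(session_1, session_2))  # both need the CLERK entry for chunk 0
print(conflicts(session_1, session_3))  # different chunks, so no shared entry
```

Sessions 1 and 2 update different rows yet still collide, because they share an index key entry. Session 3 is unaffected only because its row falls in another chunk.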
Queries that select EMP rows by department name (for example, the employees in SALES) would ordinarily have to access the DEPT table and the EMP table using
conventional indexes. We might use an index on DEPT.DNAME to find the SALES row(s) and
retrieve the DEPTNO value for SALES, and then using an INDEX on EMP.DEPTNO find the
matching rows, but by using a bitmap join index we can avoid all of that. The bitmap join
index allows us to index the DEPT.DNAME column, but have that index point not at the DEPT
table, but at the EMP table. This is a pretty radical concept (being able to index attributes
from other tables), and it might change the way you implement your data model in a reporting
system. You can, in effect, have your cake and eat it, too. You can keep your normalized data
structures intact, yet get the benefits of denormalization at the same time.
Here’s the index we would create for this example:
ops$tkyte@ORA10G> create bitmap index emp_bm_idx
  2  on emp( d.dname )
  3  from emp e, dept d
  4  where e.deptno = d.deptno
  5  /
Index created.
Note how the beginning of the CREATE INDEX looks “normal” and creates the index
INDEX_NAME on the table. But from there on, it deviates from “normal.” We see a reference
to a column in the DEPT table: D.DNAME. We see a FROM clause, making this CREATE
INDEX statement resemble a query. We have a join condition between multiple tables. This
CREATE INDEX statement indexes the DEPT.DNAME column, but in the context of the EMP
table. If we ask those questions mentioned earlier, we would find the database never accesses
the DEPT table at all, and it need not do so because the DNAME column now exists in the index
pointing to rows in the EMP table. For purposes of illustration, we will make the EMP and
DEPT tables appear “large” (to avoid having the CBO think they are small and full scanning
them instead of using indexes):
ops$tkyte@ORA10G> begin
  2     dbms_stats.set_table_stats( user, 'EMP',
  3                                 numrows => 1000000, numblks => 300000 );
  4     dbms_stats.set_table_stats( user, 'DEPT',
  5                                 numrows => 100000, numblks => 30000 );
  6  end;
  7  /
PL/SQL procedure successfully completed.
To answer a question that only counts the matching rows, such as how many employees work
in SALES, we do not have to access either the EMP or DEPT table: the entire answer comes
from the index itself. All the information needed to answer the question is available in the
index structure.
Further, when we need the rows themselves, we are able to skip accessing the DEPT table and,
using the index on EMP that incorporates the data we need from DEPT, gain direct access to
the required rows:
ops$tkyte@ORA10G> select emp.*
  2    from emp, dept
  3   where emp.deptno = dept.deptno
  4     and dept.dname = 'SALES'
  5  /
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=6145 Card=10000 Bytes=870000)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (TABLE) (Cost=6145 Card=10000 …
   2    1     BITMAP CONVERSION (TO ROWIDS)
   3    2       BITMAP INDEX (SINGLE VALUE) OF 'EMP_BM_IDX' (INDEX (BITMAP))
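A Python sketch shows why the DEPT table drops out of the plan. The join is resolved once, when the index is built, and the resulting bitmaps are keyed by DNAME but laid out over EMP row positions. The sample rows and the function build_bitmap_join_index are illustrative assumptions, not Oracle's implementation.

```python
from collections import defaultdict

# A few illustrative rows; the real tables would be far larger.
dept = [
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 20, "dname": "RESEARCH"},
    {"deptno": 30, "dname": "SALES"},
]
emp = [
    {"ename": "SMITH", "deptno": 20},
    {"ename": "ALLEN", "deptno": 30},
    {"ename": "WARD",  "deptno": 30},
    {"ename": "CLARK", "deptno": 10},
]


def build_bitmap_join_index(emp_rows, dept_rows):
    """Return {dname: bitmap over emp_rows}; DEPT is consulted only at build time."""
    dname_by_deptno = {d["deptno"]: d["dname"] for d in dept_rows}
    index = defaultdict(lambda: [0] * len(emp_rows))
    for position, e in enumerate(emp_rows):
        index[dname_by_deptno[e["deptno"]]][position] = 1
    return dict(index)


emp_bm_idx = build_bitmap_join_index(emp, dept)

# Answering the queries now needs only the index (and EMP, if rows are wanted).
sales_bitmap = emp_bm_idx["SALES"]
print(sum(sales_bitmap))                                              # count(*)
print([emp[i]["ename"] for i, bit in enumerate(sales_bitmap) if bit])  # emp.*
```

After the build step, neither lookup refers to the dept list again, which mirrors the absence of DEPT from the execution plan.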
Bitmap join indexes do have a prerequisite. The join condition must join to a primary or
unique key in the other table. In the preceding example, DEPT.DEPTNO is the primary key of
the DEPT table, and the primary key must be in place; otherwise, an error will occur:
ops$tkyte@ORA10G> create bitmap index emp_bm_idx
  2  on emp( d.dname )
  3  from emp e, dept d
  4  where e.deptno = d.deptno
  5  /
from emp e, dept d
*
ERROR at line 3:
ORA-25954: missing primary key or unique constraint on dimension
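One way to see why the prerequisite exists: if two DEPT rows shared a DEPTNO, a single EMP row would map to more than one DNAME, and the index could not say which key that row belongs under. The Python check below illustrates the ambiguity. It is only an analogy, since the ORA-25954 message indicates Oracle looks for the declared constraint; the function name and sample rows are hypothetical.

```python
from collections import Counter


def check_dimension_key(dept_rows, key="deptno"):
    """Raise ValueError if the join key does not identify exactly one dimension row."""
    counts = Counter(row[key] for row in dept_rows)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"join key is not unique for values {duplicates}")


unique_dept = [
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 30, "dname": "SALES"},
]
ambiguous_dept = [
    {"deptno": 30, "dname": "SALES"},
    {"deptno": 30, "dname": "MARKETING"},  # same key, different name
]

check_dimension_key(unique_dept)  # passes silently
try:
    check_dimension_key(ambiguous_dept)
except ValueError as err:
    print("rejected:", err)
```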