Intro To Histograms. Three Basic Questions Answered. - DBA Parad

This document provides an introduction to histograms in 3 parts: 1) Histograms are additional column statistics that sort column values into buckets to show the true data distribution, rather than assuming a uniform distribution. 2) We need histograms because assuming a uniform distribution when data is actually skewed can lead to inaccurate cardinality estimates and suboptimal execution plans. 3) Histograms are automatically created during statistics gathering if the database determines they are needed based on previous query predicates in SYS.COL_USAGE$.

Uploaded by

kruemeL1969
DBA PARADISE

The place where DBAs grow



Intro To Histograms. Three Basic Questions Answered.
June 6, 2018 By DIANAROBETE

As a DBA, sooner or later you will come across histograms.

Either you will need to create, delete, or gather histograms to fix a performance
problem,
or
you will need to explain to someone what histograms are,
or
you might simply be part of a conversation about histograms.
Either way, you will need a basic understanding of histograms.
And as a good DBA, you want to feel valuable and useful, so it is a good idea to
sharpen your skills and learn about histograms.
As always, I’ll try to break the subject down into small, bite-sized,
easy-to-understand pieces.
1.What Are Histograms?
2.Why Do We Need Histograms?
3.When Are Histograms Created?

1.What Are Histograms?


Histograms are additional, “special” column statistics.
They describe how the data is distributed within the column.
When a column has a histogram, its values are sorted into buckets. These
buckets exist only in the statistics; the data itself is not sorted on disk.
Based mainly on the number of distinct values in the column, the database picks
one of four types of histograms (as of 12c and up):
-Frequency Histograms
-Top Frequency Histograms
-Height-Balanced Histograms (legacy)
-Hybrid Histograms
These histogram types will be discussed in a future blog post.
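
To make the idea of statistic buckets a bit more concrete, here is a minimal Python sketch of a frequency-style histogram: one bucket per distinct value, in sorted order, each holding a running (cumulative) row count. This is only an illustration, not the database's actual implementation. The function name frequency_buckets and the sample list of colors are made up for this example.

```python
from collections import Counter


def frequency_buckets(values):
    """Build one bucket per distinct value, in sorted order.

    Each bucket is (value, cumulative_row_count). The input list itself
    is left untouched, just like the table data on disk is not sorted.
    """
    counts = Counter(values)
    buckets = []
    running_total = 0
    for value in sorted(counts):
        running_total += counts[value]
        buckets.append((value, running_total))
    return buckets


# Hypothetical sample column: heavily skewed towards "yellow".
colors = ["yellow"] * 97 + ["red", "blue", "green"]
buckets = frequency_buckets(colors)
print(buckets)
# [('blue', 1), ('green', 2), ('red', 3), ('yellow', 100)]
```

The number of rows for a single value is the difference between its cumulative count and the previous bucket's count, for example 100 - 3 = 97 rows for yellow.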

2.Why Do We Need Histograms?


Without histograms, the optimizer assumes the data is distributed uniformly in
the column. What would that look like?
Let’s take, for example, table A, which has 100 rows and 4 distinct values in the
column color:
red, blue, green, yellow.
Without histograms, the optimizer assumes that there are 25 rows with color red,
25 rows with color blue, 25 rows with color green, and 25 rows with color yellow
(100 rows divided by 4 distinct values).
Why is this a bad assumption?
What if there is only 1 row with color red, 1 row with color blue, 1 row with
color green, and 97 rows with color yellow?
In this case the data distribution is not uniform. We say that the data is
skewed. If the data is skewed, then the optimizer might (and most likely will)
generate inaccurate cardinality estimates, which can lead to sub-optimal
execution plans.
Histograms could make the difference between the optimizer picking an execution
plan with a full table scan and one with an index scan.
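
The arithmetic behind the color example can be sketched in a few lines of Python. This is just a toy calculation under the assumptions of the example (100 rows, 4 distinct values), not how the optimizer is actually coded. The function names uniform_estimate and histogram_estimate are made up for illustration.

```python
from collections import Counter


def uniform_estimate(num_rows, num_distinct):
    """Estimated rows per value when the data is assumed to be uniform."""
    return num_rows / num_distinct


def histogram_estimate(value_counts, value):
    """Estimated rows for a value when per-value counts are known."""
    return value_counts.get(value, 0)


# The skewed table A from the example: 1 red, 1 blue, 1 green, 97 yellow.
table_a = ["red", "blue", "green"] + ["yellow"] * 97
value_counts = Counter(table_a)

without_histogram = uniform_estimate(len(table_a), len(value_counts))
for color in ["red", "blue", "green", "yellow"]:
    with_histogram = histogram_estimate(value_counts, color)
    print(color, "uniform:", without_histogram, "histogram:", with_histogram)
```

For red the uniform assumption overestimates by a factor of 25 (25 rows estimated, 1 actual), while for yellow it underestimates by almost a factor of 4 (25 estimated, 97 actual).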

3.When Are Histograms Created?


If you gather stats on a table using dbms_stats, with METHOD_OPT set to SIZE
AUTO (which is the default), then the database will create histograms
automatically if needed.
How does the database know that histograms are needed?
After you have gathered stats on the table and run some select statements, the
dictionary table SYS.COL_USAGE$ is updated with information about the
predicates used in those queries.
The next time you gather stats, the database checks SYS.COL_USAGE$ to see which
columns need histograms, and creates histograms where they are needed.
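
The gather, query, gather again sequence described above can be modeled with a small Python toy. Please treat it as a sketch only: the class ToyDatabase, its methods, and especially the is_skewed rule (most common value appears more than twice the average) are assumptions made up for this illustration. They are not the real logic the database uses to decide that a histogram is needed.

```python
from collections import Counter


def is_skewed(counts):
    """Hypothetical skew test: top value occurs more than twice the average."""
    average = sum(counts.values()) / len(counts)
    return max(counts.values()) > 2 * average


class ToyDatabase:
    def __init__(self, rows):
        self.rows = rows
        self.col_usage = set()   # stands in for SYS.COL_USAGE$
        self.histograms = {}     # column name -> per-value row counts

    def select_where(self, column, value):
        """Run a query and record that the column was used in a predicate."""
        self.col_usage.add(column)
        return [row for row in self.rows if row[column] == value]

    def gather_stats(self):
        """Create histograms only for columns with recorded predicate usage."""
        for column in sorted(self.col_usage):
            counts = Counter(row[column] for row in self.rows)
            if is_skewed(counts):
                self.histograms[column] = dict(counts)


colors = ["red", "blue", "green"] + ["yellow"] * 97
shapes = ["circle", "triangle", "star"] + ["square"] * 97
rows = [{"color": c, "shape": s} for c, s in zip(colors, shapes)]

db = ToyDatabase(rows)
db.gather_stats()                      # no queries yet, so no histograms
first_pass = sorted(db.histograms)
matching = db.select_where("color", "red")
db.gather_stats()                      # color was used in a predicate
second_pass = sorted(db.histograms)
print(first_pass, second_pass)
# [] ['color']
```

Note that the shape column is just as skewed as color, but it never appeared in a predicate, so the toy model does not build a histogram for it.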
–Diana



4 Comments
Sithembele says:
June 6, 2018 at 10:15 pm

Awesome thanks.

Which Histogram Will Oracle Pick? – DBA Paradise says:


June 13, 2018 at 11:20 pm

[…] If you need an introduction to histograms, you can check out last week’s post:
Intro To Histograms. Three Basic Questions Answered. […]

Arsalan says:
August 16, 2018 at 7:26 am

BIGGGGGG Like!!!

dianarobete says:
August 22, 2018 at 10:28 pm

thank you!
