Representing Trees in Oracle SQL
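The examples in this chapter query a table named corporate_slaves. Here is a minimal sketch of that table and its data, consistent with the reports shown later on; the column types and the use of explicit column lists are assumptions, not the original definition:

```sql
-- Each row points at its supervisor's row via supervisor_id.
-- Column types are assumed; names and ids match the reports in the text.
create table corporate_slaves (
    slave_id       integer primary key,
    supervisor_id  integer references corporate_slaves,
    name           varchar(100)
);

insert into corporate_slaves (slave_id, supervisor_id, name) values (1, null, 'Big Boss Man');
insert into corporate_slaves (slave_id, supervisor_id, name) values (2, 1, 'VP Marketing');
insert into corporate_slaves (slave_id, supervisor_id, name) values (3, 1, 'VP Sales');
insert into corporate_slaves (slave_id, supervisor_id, name) values (4, 3, 'Joe Sales Guy');
insert into corporate_slaves (slave_id, supervisor_id, name) values (5, 4, 'Bill Sales Assistant');
insert into corporate_slaves (slave_id, supervisor_id, name) values (6, 1, 'VP Engineering');
insert into corporate_slaves (slave_id, supervisor_id, name) values (7, 6, 'Jane Nerd');
insert into corporate_slaves (slave_id, supervisor_id, name) values (8, 6, 'Bob Nerd');

select name, slave_id, supervisor_id
from corporate_slaves;
```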
8 rows selected.
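The integers in the supervisor_id column are actually pointers to other rows in the
corporate_slaves table. Need to display an org chart? With only standard SQL
available, you'd write a program in the client language (e.g., C, Lisp, Perl, or Tcl) that
walks the tree itself: query for the row with no supervisor, then query again for each
employee's direct reports until no more are found.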
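The client-side approach boils down to a sequence of queries like the following, issued in a loop by the client program. This is only a sketch; the literal ids are taken from the sample data, and a real program would substitute whatever ids the previous query returned:

```sql
-- Step 1: find whoever has no supervisor (the root of the chart).
select slave_id, name
from corporate_slaves
where supervisor_id is null;

-- Step 2: fetch the direct reports of the root (slave_id 1 in the sample data).
select slave_id, name
from corporate_slaves
where supervisor_id = 1;

-- Step 3: repeat step 2 for every slave_id returned so far,
-- e.g. for VP Sales (slave_id 3), until a query comes back empty.
select slave_id, name
from corporate_slaves
where supervisor_id = 3;
```

That is one round trip to the database per employee, which is what CONNECT BY avoids.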
With the Oracle CONNECT BY clause, you can get all the rows out at once:
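A minimal sketch of such a query, assuming the name, slave_id, and supervisor_id columns that appear in the output:

```sql
-- PRIOR refers to the parent row: a row is a child of the prior row
-- when its supervisor_id equals the prior row's slave_id.
select name, slave_id, supervisor_id
from corporate_slaves
connect by prior slave_id = supervisor_id;
```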
VP Marketing            2    1
VP Sales                3    1
Joe Sales Guy           4    3
Bill Sales Assistant    5    4
VP Engineering          6    1
Jane Nerd               7    6
Bob Nerd                8    6
Jane Nerd               7    6
Bob Nerd                8    6
20 rows selected.
This seems a little strange. Without a START WITH clause, Oracle treats every row as a root, so it has
produced all possible trees and subtrees. Let's add a START WITH clause:
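A sketch of the query with START WITH added, assuming that the root of the tree is whoever has no supervisor:

```sql
-- The subquery finds the root(s): rows whose supervisor_id is NULL.
select name, slave_id, supervisor_id
from corporate_slaves
connect by prior slave_id = supervisor_id
start with slave_id in (select slave_id
                        from corporate_slaves
                        where supervisor_id is null);
```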
8 rows selected.
Notice that we've used a subquery in the START WITH clause to find out who
is/are the big kahuna(s). For the rest of this example, we'll just hard-code in the
slave_id 1 for brevity.
Though these folks are in the correct order, it is kind of tough to tell from the
preceding report who works for whom. Oracle provides a magic pseudo-column
that is meaningful only when a query includes a CONNECT BY. The pseudo-column
is level:
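A sketch of a query that selects level alongside the other columns, with the root hard-coded as slave_id 1:

```sql
-- level is 1 for the START WITH row, 2 for its direct reports, and so on.
select name, slave_id, supervisor_id, level
from corporate_slaves
connect by prior slave_id = supervisor_id
start with slave_id = 1;
```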
8 rows selected.
The level column can be used for indentation. Here we will use the concatenation operator (||) to add
spaces in front of the name column:
select
lpad(' ', (level - 1) * 2) || name as padded_name,
slave_id,
supervisor_id,
level
from corporate_slaves
connect by prior slave_id = supervisor_id
start with slave_id = 1;
8 rows selected.
If you want to limit your report, you can use standard WHERE clauses:
select
lpad(' ', (level - 1) * 2) || name as padded_name,
slave_id,
supervisor_id,
level
from corporate_slaves
where level <= 3
connect by prior slave_id = supervisor_id
start with slave_id = 1;
7 rows selected.
Suppose that you want people at the same level to sort alphabetically. Sadly, the ORDER BY clause
doesn't work so great in conjunction with CONNECT BY:
select
lpad(' ', (level - 1) * 2) || name as padded_name,
slave_id,
supervisor_id,
level
from corporate_slaves
connect by prior slave_id = supervisor_id
start with slave_id = 1
order by level, name;
select
lpad(' ', (level - 1) * 2) || name as padded_name,
slave_id,
supervisor_id,
level
from corporate_slaves
connect by prior slave_id = supervisor_id
start with slave_id = 1
order by name;
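SQL is a set-oriented language. In the result of a CONNECT BY query, it is precisely the order that has
value. Thus it doesn't make much sense to also have an ORDER BY clause.
Joins are a separate problem. Trying to report each employee alongside his or her supervisor's name in a
single query fails: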
select
lpad(' ', (level - 1) * 2) || cs1.name as padded_name,
cs2.name as supervisor_name
from corporate_slaves cs1, corporate_slaves cs2
where cs1.supervisor_id = cs2.slave_id(+)
connect by prior cs1.slave_id = cs1.supervisor_id
start with cs1.slave_id = 1;
ERROR at line 4:
ORA-01437: cannot have join with CONNECT BY
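One way around this error is to bury the CONNECT BY in a view and JOIN against the view instead. Here is a sketch; the view name connected_slaves and the column name the_level are assumptions chosen for illustration:

```sql
-- level is renamed because a view column cannot be called LEVEL.
create or replace view connected_slaves
as
select
  lpad(' ', (level - 1) * 2) || name as padded_name,
  slave_id,
  supervisor_id,
  level as the_level
from corporate_slaves
connect by prior slave_id = supervisor_id
start with slave_id = 1;
```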
Notice that we've had to rename level so that we didn't end up with a view column named after a reserved
word. The view works just like the raw query:
8 rows selected.
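Because the view hides the CONNECT BY, it can be joined like an ordinary table. A sketch, assuming the view is named connected_slaves and exposes padded_name and supervisor_id:

```sql
-- The (+) is Oracle's outer join marker: rows from the view are kept
-- even when no matching supervisor row exists.
select
  cs.padded_name,
  boss.name as supervisor_name
from connected_slaves cs, corporate_slaves boss
where cs.supervisor_id = boss.slave_id(+);
```

Oracle makes no promise about row order after a join unless you add an ORDER BY, so the tree order in the report below should not be taken for granted.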
PADDED_NAME                    SUPERVISOR_NAME
------------------------------ --------------------
Big Boss Man
  VP Marketing                 Big Boss Man
  VP Sales                     Big Boss Man
    Joe Sales Guy              VP Sales
      Bill Sales Assistant     Joe Sales Guy
  VP Engineering               Big Boss Man
    Jane Nerd                  VP Engineering
    Bob Nerd                   VP Engineering
8 rows selected.
If you have sharp eyes, you'll notice that we've actually OUTER JOINed so that our results don't exclude
the big boss.
A subquery in the select list produces the same report without a view:
select
lpad(' ', (level - 1) * 2) || name as padded_name,
(select name
from corporate_slaves cs2
where cs2.slave_id = cs1.supervisor_id) as supervisor_name
from corporate_slaves cs1
connect by prior slave_id = supervisor_id
start with slave_id = 1;
PADDED_NAME                    SUPERVISOR_NAME
------------------------------ --------------------
Big Boss Man
  VP Marketing                 Big Boss Man
  VP Sales                     Big Boss Man
    Joe Sales Guy              VP Sales
      Bill Sales Assistant     Joe Sales Guy
  VP Engineering               Big Boss Man
    Jane Nerd                  VP Engineering
    Bob Nerd                   VP Engineering
8 rows selected.
The general rule in Oracle is that you can have a subquery that returns a single row anywhere in the select
list.
Suppose that you've built an intranet Web service. There are things that your software should show to an
employee's boss (or boss's boss) that it shouldn't show to a subordinate or peer. Here we try to figure out if
the VP Marketing (#2) has supervisory authority over Jane Nerd (#7):
select count(*)
from corporate_slaves
where slave_id = 7
and level > 1
start with slave_id = 2
connect by prior slave_id = supervisor_id;
COUNT(*)
----------
0
Apparently not. Notice that we start with the VP Marketing (#2) and stipulate level > 1 to be sure that we
will never conclude that someone supervises him or herself. Let's ask if the Big Boss Man (#1) has
authority over Jane Nerd:
select count(*)
from corporate_slaves
where slave_id = 7
and level > 1
start with slave_id = 1
connect by prior slave_id = supervisor_id;
COUNT(*)
----------
1
Even though Big Boss Man isn't Jane Nerd's direct supervisor, asking Oracle to compute the relevant
subtree yields us the correct result. In the ArsDigita Community System Intranet module, we decided that
this computation was too important to be left as a query in individual Web pages. We centralized the
question in a PL/SQL procedure:
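Here is a sketch of what such a centralized check could look like, written against the corporate_slaves table. The function name supervises_p, its parameter names, and the 't'/'f' return convention are assumptions for illustration, not the ArsDigita code:

```sql
-- Returns 't' if v_supervisor_id is above v_slave_id in the tree, else 'f'.
create or replace function supervises_p
  (v_supervisor_id in integer, v_slave_id in integer)
return varchar2
is
  n_rows_found integer;
begin
  -- Walk the subtree under the supervisor; level > 1 excludes the
  -- supervisor's own row, so nobody supervises him or herself.
  select count(*) into n_rows_found
  from corporate_slaves
  where slave_id = v_slave_id
  and level > 1
  start with slave_id = v_supervisor_id
  connect by prior slave_id = supervisor_id;

  if n_rows_found > 0 then
    return 't';
  else
    return 'f';
  end if;
end supervises_p;
/
```

With the sample data, `select supervises_p(1, 7) from dual` corresponds to the count of 1 shown above, and `supervises_p(2, 7)` to the count of 0.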
Family trees
What if the graph is a little more complicated than employee-supervisor? For example, suppose that you
are representing a family tree. Even without allowing for divorce and remarriage, exotic South African
fertility clinics, etc., we still need more than one pointer for each node:
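A sketch of a family_relatives table with the columns used by the queries in this section. The column types are assumptions. The relationships match the reports below, as do ids 1 (Nick), 2 (Cecile), and 9 (Philip); the remaining id assignments are a guess that happens to be consistent with those reports:

```sql
create table family_relatives (
    relative_id  integer primary key,
    spouse       integer references family_relatives,
    mother       integer references family_relatives,
    father       integer references family_relatives,
    first_names  varchar(100),
    last_name    varchar(100)
);

-- People whose parents are unknown.
insert into family_relatives (relative_id, first_names, last_name) values (1, 'Nick', 'Gittes');
insert into family_relatives (relative_id, first_names, last_name) values (2, 'Cecile', 'Kaplan');
insert into family_relatives (relative_id, first_names, last_name) values (5, 'Shirley', 'Greenspun');
insert into family_relatives (relative_id, first_names, last_name) values (6, 'Jack', 'Greenspun');

-- Children, each with a mother pointer and a father pointer.
insert into family_relatives (relative_id, first_names, last_name, mother, father) values (3, 'Regina', 'Gittes', 2, 1);
insert into family_relatives (relative_id, first_names, last_name, mother, father) values (4, 'Marjorie', 'Gittes', 2, 1);
insert into family_relatives (relative_id, first_names, last_name, mother, father) values (7, 'Nathaniel', 'Greenspun', 5, 6);
insert into family_relatives (relative_id, first_names, last_name, mother, father) values (8, 'Suzanne', 'Greenspun', 3, 7);
insert into family_relatives (relative_id, first_names, last_name, mother, father) values (9, 'Philip', 'Greenspun', 3, 7);
insert into family_relatives (relative_id, first_names, last_name, mother, father) values (10, 'Harry', 'Greenspun', 3, 7);
```

The spouse pointers start out NULL and are filled in by updates.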
update family_relatives
set spouse = 2
where relative_id = 1;
update family_relatives
set spouse = 6
where relative_id = 5;
update family_relatives
set spouse = 7
where relative_id = 3;
In applying the lessons from the employee examples, the most obvious problem that we face now is
whether to follow the mother or the father pointers:
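The two choices can be sketched as a pair of queries, one per pointer. Starting from relative_id 1 (Nick) and relative_id 2 (Cecile) is an assumption based on the spouse updates above and the reports that follow:

```sql
-- Follow only the father pointers, starting with Nick.
select lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name
from family_relatives
connect by prior relative_id = father
start with relative_id = 1;

-- Follow only the mother pointers, starting with Cecile.
select lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name
from family_relatives
connect by prior relative_id = mother
start with relative_id = 2;
```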
FULL_NAME
-------------------------
Nick Gittes
  Regina Gittes
  Marjorie Gittes
FULL_NAME
-------------------------
Cecile Kaplan
  Regina Gittes
    Suzanne Greenspun
    Philip Greenspun
    Harry Greenspun
  Marjorie Gittes
Here's what the official Oracle docs have to say about CONNECT BY:
specifies the relationship between parent rows and child rows of the hierarchy. condition can
be any condition as described in "Conditions". However, some part of the condition must use
the PRIOR operator to refer to the parent row. The part of the condition containing the
PRIOR operator must have one of the following forms:
PRIOR expr comparison_operator expr
expr comparison_operator PRIOR expr
There is nothing that says comparison_operator has to be merely the equals sign. Let's start again with my
mom's father but CONNECT BY more than one column:
-- follow both
select lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name
from family_relatives
connect by prior relative_id in (mother, father)
start with relative_id = 1;
FULL_NAME
-------------------------
Nick Gittes
  Regina Gittes
    Suzanne Greenspun
    Philip Greenspun
    Harry Greenspun
  Marjorie Gittes
Instead of arbitrarily starting with Grandpa Nick, let's ask Oracle to show us all the trees that start with a
person whose parents are unknown:
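A sketch of that query, using the same START WITH subquery that appears in the spouse report later in this section:

```sql
-- Roots are the people with neither a mother nor a father on file.
select lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name
from family_relatives
connect by prior relative_id in (mother, father)
start with relative_id in (select relative_id
                           from family_relatives
                           where mother is null
                           and father is null);
```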
FULL_NAME
-------------------------
Nick Gittes
  Regina Gittes
    Suzanne Greenspun
    Philip Greenspun
    Harry Greenspun
  Marjorie Gittes
Cecile Kaplan
  Regina Gittes
    Suzanne Greenspun
...
22 rows selected.
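The next report calls a PL/SQL function named family_spouse_name, which keeps the spouse lookup out of the CONNECT BY query. Here is a sketch of how such a function might be written. The body is an assumption; in particular it looks for the spouse pointer in either direction, because the updates above set it on only one member of each couple:

```sql
create or replace function family_spouse_name
  (v_relative_id in family_relatives.relative_id%type)
return varchar2
is
  v_spouse_name varchar2(500);
begin
  -- max() guarantees exactly one row comes back, with NULL when there
  -- is no spouse, so the select into never raises NO_DATA_FOUND.
  select max(s.first_names || ' ' || s.last_name)
    into v_spouse_name
    from family_relatives s
   where s.spouse = v_relative_id
      or s.relative_id = (select spouse
                          from family_relatives
                          where relative_id = v_relative_id);
  return v_spouse_name;
end family_spouse_name;
/
```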
select
lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name,
family_spouse_name(relative_id) as spouse
from family_relatives
connect by prior relative_id in (mother, father)
start with relative_id in (select relative_id from family_relatives
where mother is null
and father is null);
FULL_NAME                 SPOUSE
------------------------- --------------------
Nick Gittes               Cecile Kaplan
  Regina Gittes           Nathaniel Greenspun
    Suzanne Greenspun
    Philip Greenspun
    Harry Greenspun
  Marjorie Gittes
Cecile Kaplan             Nick Gittes
  Regina Gittes           Nathaniel Greenspun
    Suzanne Greenspun
...
To show the number of stories alongside a family member's listing, we would typically do an OUTER
JOIN and then GROUP BY the columns other than the count(family_story_id). In order not to disturb
the CONNECT BY, however, we create another PL/SQL function:
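A sketch of such a function follows. The table family_story_relative_map, with one row per (family_story_id, relative_id) pair, is hypothetical; the text does not show how stories are stored:

```sql
create or replace function family_n_stories
  (v_relative_id in family_relatives.relative_id%type)
return integer
is
  n_stories integer;
begin
  -- family_story_relative_map is an assumed mapping table.
  -- count(*) always returns one row, so no exception handling is needed.
  select count(*) into n_stories
  from family_story_relative_map
  where relative_id = v_relative_id;
  return n_stories;
end family_n_stories;
/
```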
select
lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name,
family_n_stories(relative_id) as n_stories
from family_relatives
connect by prior relative_id in (mother, father)
start with relative_id in (select relative_id from family_relatives
where mother is null
and father is null);
FULL_NAME                  N_STORIES
------------------------- ----------
Nick Gittes                        0
...
Shirley Greenspun                  0
  Nathaniel Greenspun              1
    Suzanne Greenspun              1
    Philip Greenspun               1
    Harry Greenspun                0
...
Working Backwards
What does it look like to start at the youngest generation and work back?
select
lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name,
family_spouse_name(relative_id) as spouse
from family_relatives
connect by relative_id in (prior mother, prior father)
start with relative_id = 9;
FULL_NAME                 SPOUSE
------------------------- --------------------
Philip Greenspun
  Regina Gittes           Nathaniel Greenspun
    Nick Gittes           Cecile Kaplan
    Cecile Kaplan         Nick Gittes
  Nathaniel Greenspun     Regina Gittes
    Shirley Greenspun     Jack Greenspun
    Jack Greenspun        Shirley Greenspun
We ought to be able to view all the trees starting from all the leaves but Oracle seems to be exhibiting
strange behavior:
select
lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name,
family_spouse_name(relative_id) as spouse
from family_relatives
connect by relative_id in (prior mother, prior father)
start with relative_id not in (select mother from family_relatives
union
select father from family_relatives);
no rows selected
What's wrong? If we try the subquery by itself, we get a reasonable result. Here are all the relative_ids
that appear in the mother or father column at least once.
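The subquery, lifted out of the START WITH clause above so that it can be run by itself:

```sql
-- UNION removes duplicates, so each id (and NULL) appears once.
select mother from family_relatives
union
select father from family_relatives;
```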
    MOTHER
----------
         1
         2
         3
         5
         6
         7

7 rows selected.
The answer lies in that extra blank line at the bottom. There is a NULL in this result set. Experimentation
reveals that Oracle behaves asymmetrically with NULLs and IN and NOT IN:
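A pair of experiments against dual shows the asymmetry. The literal lists are illustrative; the first query returns the single row of dual and the second returns nothing:

```sql
-- IN: 1 matches a member of the list, so the NULL does no harm.
select * from dual where 1 in (1, 2, 3, null);

-- NOT IN: comparing 1 with NULL is neither true nor false,
-- so the condition never becomes true and no row qualifies.
select * from dual where 1 not in (2, 3, null);
```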
D
-
X
no rows selected
The answer is buried in the Oracle documentation of NOT IN: "Evaluates to FALSE if any member of the
set is NULL." The correct query in this case?
select
lpad(' ', (level - 1) * 2) || first_names || ' ' || last_name as full_name,
family_spouse_name(relative_id) as spouse
from family_relatives
connect by relative_id in (prior mother, prior father)
start with relative_id not in (select mother
from family_relatives
where mother is not null
union
select father
from family_relatives
where father is not null);
FULL_NAME                 SPOUSE
------------------------- --------------------
Marjorie Gittes
  Nick Gittes             Cecile Kaplan
  Cecile Kaplan           Nick Gittes
Suzanne Greenspun
  Regina Gittes           Nathaniel Greenspun
    Nick Gittes           Cecile Kaplan
    Cecile Kaplan         Nick Gittes
  Nathaniel Greenspun     Regina Gittes
    Shirley Greenspun     Jack Greenspun
    Jack Greenspun        Shirley Greenspun
Philip Greenspun
  Regina Gittes           Nathaniel Greenspun
    Nick Gittes           Cecile Kaplan
    Cecile Kaplan         Nick Gittes
  Nathaniel Greenspun     Regina Gittes
    Shirley Greenspun     Jack Greenspun
    Jack Greenspun        Shirley Greenspun
Harry Greenspun
  Regina Gittes           Nathaniel Greenspun
    Nick Gittes           Cecile Kaplan
    Cecile Kaplan         Nick Gittes
  Nathaniel Greenspun     Regina Gittes
    Shirley Greenspun     Jack Greenspun
    Jack Greenspun        Shirley Greenspun
24 rows selected.
philg@mit.edu
Reader's Comments
Oracle9i does CONNECT BY on joins. It also adds an "ORDER SIBLINGS BY" clause,
fixing the omission that prevents you from ordering each level of the query.
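A sketch of the clause this comment describes, applied to the org chart query from earlier in the chapter. It assumes Oracle9i or later:

```sql
-- ORDER SIBLINGS BY sorts rows that share a parent
-- while leaving the parent/child nesting intact.
select
  lpad(' ', (level - 1) * 2) || name as padded_name,
  slave_id,
  supervisor_id,
  level
from corporate_slaves
start with slave_id = 1
connect by prior slave_id = supervisor_id
order siblings by name;
```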
Couldn't find the article at Dartmouth :(, it looked really interesting!
Interested readers should check out Joe Celko's nested set model for representing trees in
SQL. No need to be locked into proprietary SQL dialects and probably a couple of orders of
magnitude faster to query!
+ http://www.dbmsmag.com/9606d06.html
+ http://www.sqlteam.com/Forums/topic.asp?TOPIC_ID=14099
+ http://www.dbazine.com/oracle/or-articles/tropashko4
+ http://mrnaz.com/static/articles/trees_in_sql_tutorial/mptt_overview.php
Regards, Mattster
Related Links
representing an m-ary tree in sql- This method allows for very fast retrieval of descendants and
modification of an m-ary tree. No self-referencing or nested select statements are necessary to
retrieve some or all descendants. The labelling of nodes is such that it allows very simple and fast
querying for DFS order of nodes. It was partially inspired by Huffman encoding. (contributed by
Anthony D'Auria)
Dead link- The link above to Dartmouth College appears to be dead, but Web Archive kept a copy
of the page (contributed by Tom Lebr)