GG 400
Abstract
The grid is one of the first data structures introduced at the very beginning of computer graphics. Grids are used in several applications of computer graphics, especially in rendering algorithms. Lately, in ray tracing of dynamic scenes, the grid has received attention for its appealing linear build time. In this paper, we survey several aspects of the use of grids in ray tracing. In particular, we investigate grid traversal algorithms, building techniques and several approaches to hierarchical grids.
Categories and Subject Descriptors (according to ACM CCS): I.3.6 [Computer Graphics]: Methodology and Techniques: Graphics data structures and data types; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism: Ray tracing.
a collection of M = m^3 cubic grid cells, where m is the number of grid cells in one dimension. When the main scene voxel is not cubic, we can still subdivide it into cubic voxels by using a different number of grid cells for each dimension. We then have M = m_x m_y m_z, where m_x, m_y and m_z represent the number of grid cells along each dimension. Each cell is associated with a subset of the primitives that describe the scene. Of course, a primitive can occupy more than one cell.

2 Traversal algorithms

In ray tracing, a traversal algorithm for a grid returns all the voxels (cells) traversed by a ray. Formally, starting from a parametric representation of a ray f(t) = o + t d, where o and d are the origin and direction vectors, a traversal algorithm returns all the cells traversed by the ray. An interval [t_min, t_max) (usually [0, +∞)) is associated with a ray, and only intersections within this interval are considered. Since we are looking for the closest primitive intersection, grid traversal stops as soon as an intersection is found.

Traversal algorithms for grids are similar to rasterization. Bresenham's 2D line rasterization algorithm [Bre65], for example, produces a visually acceptable rasterized line in the shortest time. However, it does not find all the pixels traversed by the line segment. Therefore a naive 3D extension of Bresenham's algorithm is not suitable as a grid traversal algorithm.

Figure 1: (left) Bresenham cells. (right) Traversed cells.

Traversal algorithms can be classified with respect to several aspects.

Discrete integer traversal. A traversal algorithm directly generates the sequence of cells traversed by a ray. We can distinguish integer-based algorithms, which assume that each ray has a start and an end point given in integer coordinates (corresponding to the first and the last cell to visit). On the other hand, there are algorithms where rays have a starting point and a direction expressed with floating point variables [Sra95]. In the latter case one has to pay attention to precision, keeping numerical errors to a minimum.

Ray aggregation strategies: exploiting spatial coherence. Ray tracing methods usually use a geometric representation of the 3D scene. Each geometric primitive supported by a ray tracer implements a ray-primitive intersection algorithm, and the purpose of the acceleration data structure (such as a grid) is to minimize the number of intersection tests required.

Recently, an important class of techniques exploits spatial coherence to provide fast traversal performance. For example, in grids, close rays go through similar paths of cells. This coherence is especially effective for primary eye rays, hard shadow rays, soft shadow rays and many other kinds of secondary rays. There are two main strategies to exploit this coherence:
- Ray aggregation, where a group of rays is traced as a single unit (a packet), and
- Beam tracing, where rays are considered only in a final step.

The first strategy is the most widely used in current interactive systems. It works with several data structures (e.g., the kd-tree [Wal06]), allows the use of SIMD registers to process four rays in a bundle, and can be combined with frustum and interval arithmetic techniques to avoid traversal steps and intersection tests by exploiting conservative bounds of the group of rays [RSH05].

Beam strategies do not represent rays explicitly until a final sampling step. Overbeck et al. showed that beam tracing is competitive with frustum-based tracers, at least for eye and shadow rays [ORM07]. However, that work focuses on a kd-tree based data structure and, to the best of our knowledge, there are no known implementations of a grid-based beam tracer.

Traversal time. A traversal algorithm should be fast. A common cost model for grids supposes that it is possible to identify, for each algorithm, two constants [ISP07]:
- T_setup: time to set up the traversal algorithm;
- T_step: time spent to advance from a cell to its neighbor (without checking intersections).
The performance of a traversal algorithm depends directly on these two conflicting values. For example, an algorithm with a low T_setup usually performs better on low resolution grids, whereas a high T_step gives poor performance on larger grids.

2.1 Single ray traversal

Several grid traversal algorithms have been developed to determine the cells traversed by a ray in a grid. Algorithms for grid traversal are quite similar to rasterization algorithms (such as Bresenham's line drawing algorithm [Bre65]). Many three-dimensional traversal algorithms can be directly derived from an equivalent line drawing algorithm. Indeed, most of them are variants of the Digital Differential Analyzer (DDA).

We give a brief outline of the most important grid traversal algorithms. Our analysis focuses on the two-dimensional case; the extension to three-dimensional space is usually straightforward.

Cleary and Wyvill algorithm [CW88]. Cleary and Wyvill showed that the next cell calculations can be done in two or three integer operations. We henceforth call vertical (resp. horizontal) walls the vertical (resp. horizontal) lines of the grid. When a ray enters a new cell, it can cross a horizontal wall (e.g., from the bottom to the top) or a vertical wall (e.g., from left to right). Considering only the passage through vertical walls, let Δx be the distance along the ray between such crossings. Similarly, Δy is the constant
distance between successive crossings of horizontal walls.

In order to find the cell traversal order, we need to determine the sequence of crossing types (horizontal or vertical). This can be done by keeping two variables, dx and dy, which record the total distance along the ray from the origin to the next crossing of a vertical or horizontal wall, respectively. If dx < dy, then the next crossing is over a vertical wall, and the next cell is the horizontal neighbor cell (see Figure 2). Once the next crossing type is determined, the traversal algorithm updates dx (dx = dx + Δx). Similar operations are done on dy if dx > dy. Extending the algorithm to three dimensions simply requires adding the corresponding dz and Δz variables and finding the minimum of dx, dy and dz at each step.

Figure 2: An example of the Amanatides and Woo traversal algorithm with the corresponding cell traversal order. The figure shows dx and dy after traversing the third cell.

Amanatides and Woo algorithm [AW87]. The Amanatides and Woo traversal algorithm differs from Cleary and Wyvill's algorithm because it is based on floating point values. Starting from the parametric equation of the ray, with t ≥ 0, the algorithm breaks down the ray into intervals of t, each of which spans one cell. The floating point variable dx (dy) represents the value of the parameter t at which the ray hits the next vertical (horizontal) wall. If the ray direction is normalized, dx (dy) is also the distance from the origin to the wall hit point. The three-dimensional version of this traversal algorithm requires, in a traversal step, only two comparisons and one addition (floating point operations).

Modified Bresenham algorithms. As discussed above, the standard Bresenham line drawing algorithm does not consider all the cells traversed by a ray [Bre65]. Indeed, once the driving axis is chosen (e.g., the x axis) and starting from the cell (x, y), Bresenham's algorithm chooses only one cell between (x + 1, y) and (x + 1, y + 1), never both.

Zalik et al. solve this problem by proposing a code-based traversal algorithm [ZCO97]. This algorithm has two phases: first it determines the cells traversed by Bresenham's algorithm; then it examines the relationship between two successive cells to determine the remaining cells. However, this approach has some drawbacks: a higher traversal time and an elaborate implementation.

A more efficient extension of Bresenham's algorithm is proposed by Liu et al. [LZY04]. Once the driving axis is chosen, Bresenham's algorithm selects one cell to traverse at each step. The error e measures the difference between the chosen integer cell and the real coordinate of the line. The error e is compared against a threshold (1/2 in Bresenham), and then a cumulative value of the line segment slope is computed (e = e + k, with k = dy/dx when x is the driving axis).

Liu's extension, instead, determines the intersection points of the line with the two walls, and then compares the error e against a threshold of 1/2 − dy/(2 dx). If e is greater than the threshold, both (x + 1, y) and (x + 1, y + 1) are visited. Thereafter, following the Bresenham approach, the algorithm is converted to integer arithmetic. The three-dimensional case is quite similar, and identifies two possible paths from the cell (x, y, z) to (x + 1, y + 1, z + 1) (cf. [LZY04]). Liu's algorithm is faster even on modern processors, where floating point arithmetic is almost as efficient as integer arithmetic.

Figure 3: Bresenham (left) and Liu (right) thresholds.

Other approaches. Line drawing algorithms have historically attracted a lot of interest from the computer graphics research community, and several approaches and solutions have been proposed.

For example, symmetric algorithms (see [Wyv90]) exploit the fact that lines are symmetric around their center, so that two (symmetric) pixels can be drawn at a time. Notice that symmetric algorithms start from the midpoint (pixel) and proceed towards the two outermost points or, vice versa, start from the outermost pixels and proceed towards the midpoint, whereas grid traversal in general starts from the ray origin and stops as soon as an intersection is found.

Other line drawing algorithms use several patterns to draw more pixels at once. For example, instead of drawing one pixel at a time, they draw a pattern of three pixels, chosen among four precomputed patterns [Wyv90]. To the best of our knowledge, the uses and benefits of symmetric and pattern-based algorithms in the context of grid traversal have not been investigated yet.

3D Digital Differential Analyzer (3DDDA) traversal is applicable to several data structure traversal algorithms. An interesting discussion of 3DDDA is given by Fujimoto et al. [FTI]. They provide an environment for ray tracing named SEADS. The paper analyzes uniform grids, but it also discusses the traversal algorithm for octrees. They argue that the 3DDDA traversal algorithm for octrees is not as efficient as expected. They discuss the 3DDDA octree traversal in detail and compare it with the uniform grid 3DDDA traversal. They also provide experimental results showing that uniform grid 3DDDA traversal is faster than Glassner's octree traversal [Gla84].

2.2 Group of rays traversal

Unfortunately, 3DDDA-like algorithms used for single ray grid traversal cannot be easily improved using standard
traversal optimization techniques (such as packet and frustum traversal). The main issue in extending traversal from a single ray to a group of rays concerns the cells traversed by the group. Even if the paths of two close rays are quite similar, they can differ in a limited number of cells that one ray traverses and the other does not. A single ray algorithm chooses only one cell at a time to step into, whereas different rays can disagree on the next cell to be traversed.

A naive solution to this problem is to split the rays into different sub-packets, each with the same traversal decision. However, rays that have diverged in a cell (and have then been split into different sub-packets) may traverse other common cells later on. This loss of coherence can be avoided by re-merging sub-packets. Nevertheless, splitting and merging packets of rays are expensive operations, so this is not a practical solution.

Figure 4: Traversal technique strategies: (left) single step, (center) pattern based, and (right) slice by slice.

Slice by slice algorithm. Wald et al. [WIK 06] solved this problem by abandoning 3DDDA-like algorithms in favor of a new slice by slice traversal algorithm for groups of rays. The algorithm, known as CGT (Coherent Grid Traversal), traverses the grid slice by slice rather than cell by cell, avoiding expensive merge and split operations.

Given a set of coherent rays (rays that span an angle of less than π/2), the algorithm computes the packet's bounding frustum, which is traversed through one slice at a time. The major axis is chosen as the dominant component of the direction of the first ray, and the remaining two dimensions determine the slices. For each slice, the overlap of the frustum with the slice is computed incrementally, which determines the cells actually overlapped by the frustum. Each ray packet has four corner rays which define the frustum boundaries. An important feature is that the frustum traversal step can be done efficiently using SIMD instructions.

Although the overall number of ray-primitive intersection tests can be higher, because the packet can traverse cells that some rays do not intersect, in practice ray coherence easily compensates for this overhead.

Ize et al. also extend this algorithm to hierarchical grids. They show significant results for dynamic scenes, and performance competitive with the kd-tree based Intel MLRT system for static scenes (currently the best known for static scenes) [RSH05]. An interesting aspect of CGT concerns its scalability with screen resolution: since higher resolutions enable larger packets, we generally see sublinear scaling in screen resolution.

3 Building algorithms

Ize et al. [ISP07] define a general approach to build flat grids: based on some assumptions about the geometry distribution, they provide a good estimate of the grid resolution. Furthermore, they also give a theoretical analysis for hierarchical grids. The analysis focuses on two kinds of scene models: the first assumes that compact triangles are comparable to points, while the second treats long, skinny triangles as lines. An interesting property of grids is that the building process is easy to parallelize [IWRP06].

In the following we consider a cubic main voxel containing N geometric primitives describing the scene, and a given value for m. Two building algorithms have been proposed [IWRP06]:
- Sort-last building algorithms: for each primitive p, the algorithm determines the grid cells which contain p or a portion of p;
- Sort-first building algorithms: for each grid cell g, the algorithm determines the primitives which have a non-empty intersection with g.

In this paper we refer to the sort-last building algorithm as the general grid building algorithm. The algorithm can use an exact test or a faster but imprecise axis-aligned bounding box test. The latter is preferred in interactive, real-time environments.

Algorithm 1 A general grid building algorithm
m ← determinatem()
split grid in m^3 cells
for all primitives p ∈ scene do
    box ← p.boundingBox
    insert p in grid cells from box.min to box.max
end for

A general method to build grids is briefly summarized in Algorithm 1, where determinatem chooses the resolution m. If we suppose that each primitive covers a constant number of grid cells and that determinatem is O(N), then one can easily verify that the building algorithm has linear time complexity.

3.1 Grid building: preliminaries

In the following we make some basic assumptions. We assume that rays are uniformly distributed in space, hence each cell has the same probability of being reached by a ray. A similar assumption is made by the Surface Area Heuristic (SAH). We also assume a single ray 3DDDA-like traversal algorithm, and that it is possible to determine the time required by atomic operations:
- T_setup: time to do the initial intersection with the main voxel bounding box and set up the traversal;
- T_step: time spent to advance from a cell to its neighbor (without checking intersections);
- T_inter: time required to check a ray-primitive intersection.

We also suppose that each primitive covers a bounded number of grid cells. In the following we denote by N_m the number of intersections between primitives and cells for a given subdivision factor m. The number of cells covered by a primitive depends on the scene model considered. For instance, if the model is based on lines, a primitive covers roughly m cells (N_m ≈ N m), whereas in a point-based model a primitive covers exactly one cell (N_m ≈ N).

3.2 A traversal time cost-based model

In this section we shift our attention to the choice of the subdivision resolution. We will show that this problem is not trivial and that it is strongly influenced by the scene model. The choice of the subdivision factor (i.e., m) affects traversal time and memory occupation.

Defining a traversal time cost-based approach during the building phase is common in ray tracing. In particular, in the context of kd-tree building, split planes are chosen by using a Surface Area Heuristic (SAH) to evaluate possible candidate planes. However, exact SAH build algorithms have O(n log n) time complexity, which is considered too high for interactive purposes. A first complete cost model for grids was introduced in [CW88].

Ize et al. [ISP07] introduced a clear and simple cost model for grids. Following the above assumptions, a ray reaches m cells on average. The average traversal time T in a flat grid is:

\[ T = T_{setup} + m \, T_{step} + \frac{N_m}{m^2} \, T_{inter} \]

where N_m / m^2 is the average number of primitives contained (or partially contained) in m cells.

The value of T strongly depends on the scene model, on the value of the subdivision factor m and on the cell occupation of the primitives (N_m).

In point-based models, each primitive is contained in exactly one cell (i.e., N_m = N), and therefore each cell contains N/m^3 primitives on average. Since we are looking for the number of primitives contained in m cells, we have N_m / m^2 = N/m^2. On the other hand, in line-based models each primitive is partially contained in m cells (N_m = N m), hence N_m / m^2 = N/m.

If we have enough information about the scene model, it is possible to determine the value of m which minimizes T [ISP07]. In particular, if we consider point-based models, then

\[ M = N \, \frac{2 \, T_{inter}}{T_{step}} = O(N). \qquad (1) \]

We recall that T_inter and T_step are constant.

For line-based models

\[ M = \left( N \, \frac{T_{inter}}{T_{step}} \right)^{3/2} = O(N^{1.5}). \qquad (2) \]

Commonly used models behave somewhere between these two extremes. There is, however, a remarkable asymptotic difference between the two models, and the use of these theoretical results requires knowledge of the scene model type.

Mailboxing. Mailboxing [AW87] is a common technique that avoids multiple identical ray-primitive intersection tests. A revised approach that includes mailboxing requires redefining N_m as the average number of distinct primitives contained (or partially contained) in m cells. For the sake of simplicity we do not consider this approach in this cost-based model.

3.3 Choosing the resolution

The choice of m is fundamental for traversal performance. The approach of Ize et al. [ISP07] requires an a priori estimate of the model's cell occupation.

It is possible to determine an exact value of N_m for a given m by counting primitives during the build. This means that, for a given m, it is possible to evaluate N_m in linear time. A guess and check building algorithm can test several values of m in linear time and then select the one which minimizes the cost function T. The theoretical analysis gives us a bounded interval in which to look for a good value of m. As an example, assume that T_setup ≈ 1.0, T_step ≈ 0.1 and T_inter ≈ 0.5. Since M = m^3, Eq. (1) gives m = (10N)^{1/3} and Eq. (2) gives m = (5N)^{1/2}, so that m ∈ [(10N)^{1/3}, (5N)^{1/2}].

Sampling-based building algorithm. The idea of guess and check algorithms is to sample the cost function in order to build an efficient grid. This idea is not new. For example, in the context of SAH kd-tree builds [WH06], Popov et al. [PGSS06] provide theoretical and practical results on conservative sub-sampling of the SAH cost function in kd-trees. Hunt et al. [HMS06] define an adaptive error-bounded heuristic based on a scanning-based algorithm for choosing kd-tree split planes that are close to optimal with respect to the SAH criterion. Similar works exploit SAH builds in other data structures such as BVHs.

Notice that our cost-based grid building can be seen as a SAH algorithm, where a grid resolution is weighted using the number of primitives contained in a cell. Because grid cells are voxels that all have the same surface area, the probability that a ray hits a cell is equal for all cells. Whereas in a kd-tree the SAH evaluates the cost of a candidate plane, in a grid we evaluate the cost of a grid resolution (our m).

In this context we can use the theoretical results to bound the sampling, and evaluate the cost function for several values of m (in our case m ∈ [(10N)^{1/3}, (5N)^{1/2}]). A naive approach can sample this interval at a constant number of points, selecting the resolution in linear time. Hence we have a linear overall building time.
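
The guess and check idea can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation evaluated in the paper: the function names (count_cell_overlaps, traversal_cost, choose_resolution), the uniform sampling of the interval and the default of eight samples are hypothetical choices. Primitives are represented only by their axis-aligned bounding boxes, N_m is counted with the bounding box test of Algorithm 1, and the interval is bounded by m = (2 N T_inter / T_step)^{1/3} and m = (N T_inter / T_step)^{1/2}, the optima of the point-based and line-based models.

```python
def count_cell_overlaps(boxes, scene_min, scene_max, m):
    """Return N_m, the number of (primitive, cell) pairs in an m x m x m grid.

    boxes is a list of (min_corner, max_corner) tuples; the scene box is
    assumed to have a non-zero extent along every axis.
    """
    n_m = 0
    for box_min, box_max in boxes:
        cells = 1
        for axis in range(3):
            extent = scene_max[axis] - scene_min[axis]
            lo = int((box_min[axis] - scene_min[axis]) / extent * m)
            hi = int((box_max[axis] - scene_min[axis]) / extent * m)
            # Clamp so that coordinates on the far scene face stay in the grid.
            lo = min(max(lo, 0), m - 1)
            hi = min(max(hi, 0), m - 1)
            cells *= hi - lo + 1
        n_m += cells
    return n_m


def traversal_cost(n_m, m, t_setup=1.0, t_step=0.1, t_inter=0.5):
    """Average traversal time T = T_setup + m T_step + (N_m / m^2) T_inter."""
    return t_setup + m * t_step + (n_m / m ** 2) * t_inter


def choose_resolution(boxes, scene_min, scene_max, samples=8,
                      t_setup=1.0, t_step=0.1, t_inter=0.5):
    """Sample the cost function at a constant number of resolutions.

    Each sample costs O(N), so the whole selection is linear in N.
    """
    n = len(boxes)
    # Interval bounds from the point-based and line-based optima (assumption:
    # real scenes fall between the two models).
    m_lo = max(1, round((2 * n * t_inter / t_step) ** (1 / 3)))
    m_hi = max(m_lo, round((n * t_inter / t_step) ** 0.5))
    steps = max(samples - 1, 1)
    candidates = sorted({round(m_lo + (m_hi - m_lo) * i / steps)
                         for i in range(steps + 1)})

    def cost(m):
        n_m = count_cell_overlaps(boxes, scene_min, scene_max, m)
        return traversal_cost(n_m, m, t_setup, t_step, t_inter)

    return min(candidates, key=cost)
```

For a purely point-like scene N_m stays equal to N at every resolution, so the sampled cost is lowest at the point-based end of the interval; scenes with long primitives push the selected m towards the other end.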
The traversal algorithm is unsuited for GPU computation, and branching is unavoidable. Today, for ray tracing, CPUs seem to be more suitable than GPUs.

References

[Gla84] GLASSNER A. S.: Space subdivision for fast ray tracing. IEEE Computer Graphics & Applications 4, 10 (Oct. 1984), 15–22.
[Hav00] HAVRAN V.: Heuristic Ray Shooting Algorithms. Ph.D. thesis, Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, November 2000.
[HMS06] HUNT W., MARK W. R., STOLL G.: Fast kd-tree construction with an adaptive error-bounded heuristic. In 2006 IEEE Symposium on Interactive Ray Tracing (Sept. 2006).
[ISP07] IZE T., SHIRLEY P., PARKER S. G.: Grid Creation Strategies for Efficient Ray Tracing. In IEEE/EG Symposium on Interactive Ray Tracing (Sept. 2007), pp. 27–32.
[IWRP06] IZE T., WALD I., ROBERTSON C., PARKER S. G.: An Evaluation of Parallel Grid Construction for Ray Tracing Dynamic Scenes. In Proceedings of the 2006 IEEE Symposium on Interactive Ray Tracing (2006), pp. 27–55.
[JW89] JEVANS D., WYVILL B.: Adaptive voxel subdivision for ray tracing. In Graphics Interface (June 1989), pp. 164–172.
[KS97] KLIMASZEWSKI K. S., SEDERBERG T. W.: Faster ray tracing using adaptive grids. IEEE Comput. Graph. Appl. 17, 1 (1997), 42–51.
[Whi80] WHITTED T.: An improved illumination model for shaded display. Communications of the ACM 23, 6 (1980), 343–349.
[WIK 06] WALD I., IZE T., KENSLER A., KNOLL A., PARKER S. G.: Ray tracing animated scenes using coherent grid traversal. ACM Transactions on Graphics 25, 3 (July 2006), 485–493.
[WMG 07] WALD I., MARK W. R., GÜNTHER J., BOULOS S., IZE T., HUNT W., PARKER S. G., SHIRLEY P.: State of the Art in Ray Tracing Animated Scenes. In Eurographics 2007 State of the Art Reports (2007).
[Wyv90] WYVILL B.: Symmetric double step line algorithm. Graphics Gems (1990), 101–104.
[ZCO97] ZALIK B., CLAPWORTHY G., OBLONSEK C.: An efficient code-based voxel-traversing algorithm. Comput. Graph. Forum 16, 2 (1997), 119–128.
Figure 5: Impact of resolution in four test scenes: two scanned models, an architectural model, and a particle-based model. Of the two theoretical models, our tests suggest that real-world scenes are closer to the point model than to the line model. In particular, particle-based scenes are less sensitive to the choice of resolution. As expected, the architectural model has a higher traversal cost.