
Eurographics Italian Chapter Conference (2008)

V. Scarano, R. De Chiara, and U. Erra (Editors)

A Survey on Exploiting Grids for Ray Tracing


Biagio Cosenza

ISISLab, Dipartimento di Informatica ed Applicazioni R.M. Capocelli


Università degli Studi di Salerno, Italy
[email protected]

Abstract
The grid is one of the first data structures introduced at the very beginning of computer graphics. Grids are used
in several applications of computer graphics, especially in rendering algorithms. Lately, grids have received
attention in ray tracing of dynamic scenes for their appealing linear build time. In this paper, we survey
several aspects of the use of grids in ray tracing. In particular, we investigate grid traversal algorithms,
building techniques and several approaches to hierarchical grids.
Categories and Subject Descriptors (according to ACM CCS): I.3.6 [Computer Graphics]: Methodology and Techniques: Graphics data structures and data types; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism: Ray tracing.

1 Introduction

Several works (e.g., [Hav00]) have shown the significance of data structures for rendering algorithms, and in particular for ray tracing [Whi80]. Ray tracing is a widely used algorithm for rendering highly realistic images. It requires a large amount of computational power; therefore, much research has aimed at speeding it up with acceleration data structures such as grids, octrees, bounding volume hierarchies, or kd-trees (see [Hav00]). In particular, it has been shown that, in a general kd-tree implementation for static scenes, about 60% of ray tracing time is spent in tree traversal [Wal06].

Although ray tracing used to be considered unsuitable for interactive applications, several recent works [Wal06, RSH05, BSW06] have shown that interactive performance is achievable, at least for static scenes, by using traversal optimizations such as packet and frustum traversal.

The choice of an efficient data structure becomes quite intricate when dynamic scenes are considered. In that case not only traversal time, but also build and update time, must be taken into account when choosing the acceleration data structure. Indeed, as shown in [WMG*07], ray tracing of dynamic scenes has shifted the attention from traversal time to build and update time.

In this context, grid data structures [Gla84] become appealing: although kd-trees outperform grids with respect to traversal time, they are not suitable for dynamic scenes because of their high rebuild/update cost. As an example, the surface area heuristic, required to build fast-traversal kd-trees [WH06], may take seconds to minutes to build a kd-tree for quite complex scenes. Therefore kd-trees can be used efficiently only for static scenes. This restriction limits the utility of the kd-tree as a data structure for ray tracing in many interactive scenarios, such as visual simulation, animation, and interactive games. While some efforts have focused on extending kd-trees to dynamic scenes [WMG*07], to the best of our knowledge they are limited to hierarchical motion or require advance knowledge of the scene, and are therefore unsuitable for purely dynamic animations with unstructured motion.

Grid data structures, introduced by [Gla84], divide the scene space into uniform voxels. In contrast with kd-trees and other adaptive data structures, grids have a linear-time build algorithm, so they can be created from scratch and updated at every frame with interactive performance [RSH00] (at least for moderately sized scenes). Consequently, even though grids have a higher traversal time than kd-trees, they are considered attractive for dynamic scenes because of their faster build time.

Grid data structures are popular not only for rendering the analytical objects of a scene, but also in other scenarios such as particle-based simulation, computational fluid dynamics and volume rendering.

Assumptions on grid. Formally, a grid is a spatial data structure that divides the scene space into voxels of constant size, named cells. We assume a cubic main voxel that represents the whole scene. A grid subdivides the main voxel equally along each dimension. Therefore, a grid can be seen as

© The Eurographics Association 2008.




a collection of $M = m^3$ cubic grid cells, where m is the number of grid cells in one dimension. When the main scene voxel is not cubic, we can still subdivide it into cubic voxels by using a different number of grid cells for each dimension. We then have $M = m_x m_y m_z$, where $m_x$, $m_y$ and $m_z$ are the numbers of grid cells along each dimension. Each cell is associated with a subset of the primitives that describe the scene. Of course, a primitive can overlap more than one cell.

2 Traversal algorithms

In ray tracing, a traversal algorithm for a grid returns all the voxels (cells) traversed by a ray. Formally, starting from the parametric representation of a ray $f(t) = \vec{o} + t\,\vec{d}$, where $\vec{o}$ and $\vec{d}$ are the origin and direction vectors, a traversal algorithm returns all the cells traversed by the ray. An interval $[t_{min}, t_{max})$ (usually $[0, +\infty)$) is associated with a ray, and only intersections within this interval are considered. Since we are looking for the closest primitive intersection, grid traversal stops as soon as an intersection is found.
Traversal algorithms in grids are similar to rasterization. Bresenham's 2D line rasterization algorithm [Bre65], for example, produces a visually acceptable rasterized line in the shortest time. However, it does not find all the pixels traversed by the line segment. Therefore a naive 3D extension of Bresenham's algorithm is not suitable as a grid traversal algorithm.

Figure 1: (left) Bresenham cells. (right) Traversed cells.

Traversal algorithms can be classified with respect to several aspects.

Discrete integer traversal. A traversal algorithm directly generates the sequence of cells traversed by a ray. We can distinguish integer-based algorithms, which assume that each ray has a start and an end point given in integer coordinates (corresponding to the first and the last cell to visit). On the other hand, there are algorithms where rays have a starting point and a direction expressed with floating point variables [Sra95]. In the latter case one has to pay attention to precision, keeping numerical errors to a minimum.

Ray aggregation strategies: exploiting spatial coherence. Ray tracing methods usually use a geometric representation of the 3D scene. Each geometric primitive supported by a ray tracer implements a ray-primitive intersection algorithm, and the purpose of the acceleration data structure (such as a grid) is to minimize the number of intersection tests required.
Recently, an important class of techniques has exploited spatial coherence to provide fast traversal performance. For example, in grids, nearby rays go through similar paths of cells. This coherence is effective especially for primary eye rays, hard shadow rays, soft shadow rays and many other kinds of secondary rays. There are two main strategies to exploit this coherence:
- Ray aggregation, where a group of rays is traced as a single unit (a packet), and
- Beam tracing, where rays are considered only in a final step.
The first strategy is the most used by current interactive systems. It is used with several data structures (e.g., kd-trees [Wal06]), allows the use of SIMD registers to process four rays in a bundle, and can be combined with frustum and interval arithmetic techniques to avoid traversal steps and intersection tests by exploiting conservative bounds of the group of rays [RSH05].
Beam strategies do not represent rays explicitly until a final sampling step. Overbeck et al. showed that beam tracing is competitive with frustum-based tracers, at least for eye and shadow rays [ORM07]. However, this work focuses on a kd-tree based data structure and, to the best of our knowledge, there are no known implementations of a grid-based beam tracer.

Traversal time. A traversal algorithm should be fast. A common cost model for grids supposes that it is possible to identify, for each algorithm, two constants [ISP07]:
- $T_{setup}$: time to set up the traversal algorithm;
- $T_{step}$: time spent to advance from a cell to its neighbor (without checking intersections).
The performance of a traversal algorithm depends directly on these two conflicting values. For example, an algorithm with a low $T_{setup}$ usually performs better on low-resolution grids, whereas a high $T_{step}$ gives bad performance on larger grids.

2.1 Single ray traversal

Several grid traversal algorithms have been developed historically to determine the cells traversed by a ray in a grid. Algorithms for grid traversal are quite similar to rasterization algorithms (such as Bresenham's line drawing algorithm [Bre65]). Many three-dimensional traversal algorithms can be derived directly from an equivalent line drawing algorithm. Indeed, most of them are variants of the Digital Differential Analyzer (DDA).
We give a brief outline of the most important grid traversal algorithms. Our analysis focuses on the two-dimensional case; the extension to three-dimensional space is usually simple and straightforward.

Cleary and Wyvill algorithm [CW88]. Cleary and Wyvill showed that the next-cell calculation can be done in two or three integer operations. We henceforth call vertical (resp. horizontal) walls the vertical (resp. horizontal) axis-aligned lines of the grid. When a ray enters a new cell, it crosses either a horizontal wall (e.g., from bottom to top) or a vertical wall (e.g., from left to right). Considering only the passage through vertical walls, let $\Delta x$ be the distance along the ray between such crossings. Similarly, $\Delta y$ is the constant


distance between successive crossings of horizontal walls. In order to find the cell traversal order, we need to determine the sequence of crossing types (horizontal or vertical). This can be done by keeping two variables, dx and dy, which record the total distance along the ray from the origin to the next crossing of a vertical or horizontal wall, respectively. If dx < dy, then the next crossing is over a vertical wall, and the next cell is the horizontal neighbor cell (see Figure 2). Once the next crossing type is determined, the traversal algorithm updates dx ($dx = dx + \Delta x$). Similar operations are done on dy if dx > dy. Extending the algorithm to three dimensions simply requires adding the appropriate dz and $\Delta z$ variables, and finding the minimum of dx, dy and dz at each step.

Figure 2: An example of Amanatides and Woo traversal algorithm with the corresponding cell traversal order. The figure shows dx and dy after traversing the third cell.

Figure 3: Bresenham (left) and Liu (right) thresholds.

Amanatides and Woo algorithm [AW87]. The Amanatides and Woo traversal algorithm differs from Cleary and Wyvill's algorithm because it is based on floating point values. Starting from the parametric equation of the ray, with $t \geq 0$, the algorithm breaks the ray into intervals of t, each of which spans one cell. The floating point variable dx (dy) represents the value of the parameter t at which the ray hits the next vertical (horizontal) wall. If the ray direction is normalized, dx (dy) is also the distance from the origin to the wall hit point. The three-dimensional version of this traversal algorithm requires, in a traversal step, only two comparisons and one addition (floating point operations).

Modified Bresenham algorithms. As discussed above, standard Bresenham line drawing does not consider all the cells traversed by a ray [Bre65]. Indeed, once the driving axis is chosen, e.g., the x axis, and starting from the cell (x, y), Bresenham's algorithm chooses only one cell between (x + 1, y) and (x + 1, y + 1), never both.
Zalik et al. solve this problem by proposing a code-based traversal algorithm [ZCO97]. This algorithm has two phases: first it determines the cells traversed by Bresenham's algorithm; then it examines the relationship between two successive cells to determine the remaining cells. However, this approach has some drawbacks: a higher traversal time and an elaborate implementation.
A more efficient extension of Bresenham's algorithm is proposed by Liu et al. [LZY04]. Bresenham's algorithm, once the driving axis is chosen, selects one cell to traverse at each step. The error e measures the difference between the chosen integer cell and the real coordinate of the line. The error e is compared against a threshold (1/2 in Bresenham), and then the cumulative value of the line segment slope is updated ($e = e + k$, with $k = dy/dx$).
Liu's extension, instead, determines the intersection points of the line with the two walls, and then compares the error e against a threshold expressed in terms of 1/2, dx and dy (cfr. [LZY04]). If e is greater than the threshold, both (x + 1, y) and (x + 1, y + 1) are visited. Thereafter, following the Bresenham approach, the algorithm is adapted to integer arithmetic. The three-dimensional case is quite similar, and identifies two possible paths from the cell (x, y, z) to (x + 1, y + 1, z + 1) (cfr. [LZY04]). Liu's algorithm is faster even on modern processors, where floating point arithmetic is almost as efficient as integer arithmetic.

Other approaches. Line drawing algorithms have historically attracted a lot of interest from the computer graphics research community, and several approaches and solutions have been proposed.
For example, symmetric algorithms (see [Wyv90]) exploit the fact that lines are symmetric around their center, so that two (symmetric) pixels can be drawn at a time. Notice that symmetric algorithms start from the midpoint (pixel) and proceed towards the two outermost points or, vice versa, start from the outermost pixels and proceed towards the midpoint, whereas, in general, a traversal algorithm starts from the origin and stops as soon as an intersection is found.
Other line drawing algorithms use patterns to draw several pixels at once. For example, instead of drawing one pixel at a time, they draw a pattern of three pixels, chosen among four precomputed patterns [Wyv90]. To the best of our knowledge, the uses and benefits of symmetric and pattern-based algorithms in the context of grid traversal have not been investigated yet.
3D Digital Differential Analyzer (3DDDA) traversal is applicable to several data structure traversal algorithms. An interesting discussion of 3DDDA is given by Fujimoto et al. [FTI]. They provide an environment for ray tracing named SEADS. The paper analyzes uniform grids, but it also discusses the traversal algorithm for octrees. They argue that the 3DDDA traversal algorithm for octrees is not as efficient as expected. They discuss the 3DDDA octree traversal in detail and compare it with the uniform grid 3DDDA traversal. They also provide experimental results showing that uniform grid 3DDDA traversal is faster than Glassner's octree traversal [Gla84].

2.2 Group of rays traversal

Unfortunately, 3DDDA-like algorithms used for single ray grid traversal cannot be easily improved using standard


traversal optimization techniques (such as packet and frustum traversal). The main issue in extending traversal from a single ray to a group of rays concerns the cells traversed by the group. Even if the paths of two nearby rays are quite similar, they can differ in a small number of cells that one ray traverses and the other does not. A single ray algorithm chooses only one cell at a time to step into, whereas different rays can disagree on the next cell to be traversed.
A naive solution to this problem is to split the rays into different sub-packets that share the same traversal decision. However, rays that have diverged in a cell (and have then been split into different sub-packets) may traverse other common cells later on. This loss of coherence can be avoided by re-merging sub-packets. Nevertheless, splitting and merging packets of rays are expensive operations, so this is not a practical solution.

Figure 4: Traversal technique strategies: (left) single step, (center) pattern based, and (right) slice by slice.

Slice by slice algorithm. Ize et al. [WIK*06] have solved this problem by abandoning 3DDDA-like algorithms in favor of a new slice by slice traversal algorithm for groups of rays. The algorithm, known as CGT (Coherent Grid Traversal), traverses the grid slice by slice rather than cell by cell, avoiding expensive merge and split operations.
Given a set of coherent rays (rays that span an angle of less than $\pi/2$), the algorithm computes the packet's bounding frustum, which is traversed through the grid one slice at a time. The major (dominant) axis is chosen by selecting the dominant component of the direction of the first ray, and the remaining two dimensions determine the slices. For each slice, the overlap of the frustum with the slice is computed incrementally, which determines the cells actually overlapped by the frustum. Each ray packet has four corner rays which define the frustum boundaries. An important feature is that the frustum traversal step can be done efficiently using SIMD instructions.
Although the overall number of ray-primitive intersection tests can be higher, because the packet can traverse cells that some of its rays do not intersect, in practice ray coherence easily compensates for this overhead.
Ize et al. also extend this algorithm to hierarchical grids. They show significant results on dynamic scenes, and results competitive with the Intel MLRT system, based on kd-trees, on static scenes (currently the best known system for static scenes) [RSH05]. An interesting consideration about CGT concerns its scalability with screen resolution: since higher resolutions enable larger packets, scaling in screen resolution is generally sublinear.

3 Building algorithms

Ize et al. [ISP07] define a general approach to building flat grids: based on some assumptions on the geometry distribution, they provide a good estimate of the grid resolution. Furthermore, they also present a theoretical analysis for hierarchical grids. The analysis focuses on two kinds of scene models: the first assumes that compact triangles are comparable to points, while the second treats long, skinny triangles as lines. An interesting property of grids is that the building process is easy to parallelize [IWRP06].

In the following we consider a cubic main voxel containing N geometric primitives describing the scene, and a given value for m. Two building algorithms have been proposed [IWRP06]:
- Sort-last building algorithms: for each primitive p, the algorithm determines the grid cells that contain p or a portion of p;
- Sort-first building algorithms: for each grid cell g, the algorithm determines the primitives that have a non-empty intersection with g.
In this paper we refer to the sort-last building algorithm as the general grid building algorithm. The algorithm can use an exact test or a faster but imprecise axis-aligned bounding box test. The latter is preferred in interactive real-time environments.

Algorithm 1 A general grid building algorithm
  m ← determinatem()
  split grid in m^3 cells
  for all primitive p ∈ scene do
    box ← p.boundingBox
    insert p in grid cells from box.min to box.max
  end for

A general method to build grids is summarized in Algorithm 1, where determinatem is the routine that chooses m.
If we suppose that each primitive covers a constant number of grid cells, and assume that determinatem is O(N), then one can easily verify that the building algorithm has linear time complexity.

3.1 Grid building: preliminaries

In the following we make some basic assumptions. We assume that rays are uniformly distributed in space, hence each cell has the same probability of being reached by a ray; a similar assumption is made by the Surface Area Heuristic (SAH). We also assume a single ray 3DDDA-like traversal algorithm, and that it is possible to determine the time required by its atomic operations:


- $T_{setup}$: time to do the initial intersection with the main voxel bounding box and set up the traversal;
- $T_{step}$: time spent to advance from a cell to its neighbor (without checking intersections);
- $T_{inter}$: time required to check a ray-primitive intersection.
We also suppose that each primitive covers a bounded number of grid cells. In the following we denote by $N_m$ the number of intersections between primitives and cells for a given subdivision factor m. The number of cells covered by a primitive depends on the scene model considered. For instance, if the model is based on lines, a primitive covers roughly m cells ($N_m \approx N m$), whereas in a point-based model a primitive covers exactly one cell ($N_m \approx N$).

3.2 A traversal time cost-based model

In this section we shift our attention to the choice of the subdivision resolution. We will show that this problem is not trivial and that it is strongly influenced by the scene model. The choice of the subdivision factor (i.e., m) affects traversal time and memory occupation.
Defining a traversal time cost-based approach during the building phase is common in ray tracing. In particular, in kd-tree building, split planes are chosen by using a Surface Area Heuristic (SAH) to evaluate possible candidate planes. However, exact SAH build algorithms have $O(n \log n)$ time complexity, which is considered too high for interactive purposes. A first complete cost model for grids was introduced in [CW88].

Ize et al. [ISP07] introduced a clear and simple cost model for grids. Following the above assumptions, a ray reaches m cells on average. The average traversal time T in a flat grid is:
$$T = T_{setup} + m\,T_{step} + \lambda\,T_{inter}$$
where $\lambda = N_m / m^2$ is the average number of primitives contained (or partially contained) in m cells.
The value of T strongly depends on the scene model, on the value of the subdivision factor m and on the primitives' cell occupation ($N_m$).

In point-based models, each primitive is contained in exactly one cell (i.e., $N_m = N$), and therefore each cell contains $N/m^3$ primitives on average. Since we are looking for the number of primitives contained in m cells, we have $\lambda = N/m^2$. On the other hand, in line-based models each primitive is partially contained in m cells ($N_m = N m$), hence $\lambda = N/m$.
If we have enough information about the scene model, it is possible to determine the value of m which minimizes T [ISP07], by setting the derivative of T with respect to m to zero. In particular, if we consider point-based models, then
$$M = m^3 = N\,\frac{2\,T_{inter}}{T_{step}} = O(N). \qquad (1)$$
We recall that $T_{inter}$ and $T_{step}$ are constant. For line-based models
$$M = m^3 = \left(N\,\frac{T_{inter}}{T_{step}}\right)^{3/2} = O(N^{1.5}). \qquad (2)$$
Commonly used models behave between these two extremes. In any case, there is a remarkable asymptotic difference between the two models. Nevertheless, using these theoretical results requires knowledge of the scene model type.

Mailboxing. Mailboxing [AW87] is a common technique that avoids multiple identical ray-primitive intersection tests. A revised approach that includes mailboxing requires redefining $N_m$ as the average number of distinct primitives contained (or partially contained) in m cells. For the sake of simplicity we do not consider this approach in this cost-based model.

3.3 Choosing the resolution

The choice of m is fundamental for traversal performance. The work of Ize et al. [ISP07] requires an a priori estimate of the model's occupation.
It is possible to determine the exact value of $N_m$ for a given m by counting primitives during the build. This means that, for a given m, $N_m$ can be evaluated in linear time. A guess-and-check building algorithm can test several values of m, each in linear time, and then select the one which minimizes the cost function T. The theoretical analysis gives us a bounded interval in which to look for a good value of m. As an example, assuming that $T_{setup} \approx 1.0$, $T_{step} \approx 0.1$ and $T_{inter} \approx 0.5$, Equations (1) and (2) give $m \in [(10N)^{1/3}, (5N)^{1/2}]$.

Sampling-based building algorithm. The idea of guess-and-check algorithms is to sample the cost function in order to build an efficient grid. This idea is not new. For example, in the context of SAH kd-tree builds [WH06], Popov et al. [PGSS06] provide theoretical and practical results on conservatively sub-sampling the SAH cost function in kd-trees. Hunt et al. [HMS06] define an adaptive error-bounded heuristic, based on a scanning algorithm, for choosing kd-tree split planes that are close to optimal with respect to the SAH criterion. Similar works exploit SAH builds in other data structures such as BVHs.

Notice that our cost-based grid building is a SAH-like algorithm, where a grid resolution is weighted by the number of primitives contained in a cell. Because grid cells are voxels that all have the same surface area, the probability that a ray hits a cell is the same for all cells. Whereas in a kd-tree the SAH evaluates the cost of a candidate plane, in a grid we evaluate the cost of a grid resolution (our m).
In this context we can use the theoretical results to bound the sampling, and evaluate the cost function for several values of m (in our case $m \in [(10N)^{1/3}, (5N)^{1/2}]$). A naive approach can sample this interval at a constant number of points, selecting the resolution in linear time. Hence we have a linear overall building time.
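
The guess-and-check selection described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure of [ISP07]: the scene is assumed to be normalized to the unit cube, each primitive is represented by its axis-aligned bounding box, the cost constants default to the example values used in the text, and all function names (`traversal_cost`, `resolution_bounds`, `count_overlaps`, `choose_resolution`) are hypothetical. Spacing the candidate resolutions evenly between the point-model and line-model bounds is one possible sampling choice among many.

```python
import math


def traversal_cost(m, n_m, t_setup=1.0, t_step=0.1, t_inter=0.5):
    """Average traversal time T = T_setup + m*T_step + lambda*T_inter,
    with lambda = N_m / m^2 (primitives met along m cells)."""
    lam = n_m / (m * m)
    return t_setup + m * t_step + lam * t_inter


def resolution_bounds(n, t_step=0.1, t_inter=0.5):
    """Integer interval [lo, hi] for m. The lower bound comes from the point
    model (m^3 = 2*N*T_inter/T_step), the upper bound from the line model
    (m^2 = N*T_inter/T_step)."""
    m_point = (2.0 * n * t_inter / t_step) ** (1.0 / 3.0)
    m_line = (n * t_inter / t_step) ** 0.5
    lo = max(1, math.floor(m_point))
    hi = max(lo, math.ceil(m_line))
    return lo, hi


def count_overlaps(boxes, m):
    """N_m: number of (primitive, cell) pairs for an m x m x m grid over the
    unit cube (assumption). Each box is ((xmin, ymin, zmin), (xmax, ymax, zmax)).
    One pass over the primitives, so O(N) for a given m."""
    total = 0
    for box_min, box_max in boxes:
        cells = 1
        for a, b in zip(box_min, box_max):
            first = min(m - 1, max(0, int(a * m)))
            last = min(m - 1, max(0, int(b * m)))
            cells *= last - first + 1
        total += cells
    return total


def choose_resolution(boxes, samples=8, t_setup=1.0, t_step=0.1, t_inter=0.5):
    """Evaluate T at a constant number of candidate resolutions inside the
    theoretical interval and return the cheapest one."""
    lo, hi = resolution_bounds(len(boxes), t_step, t_inter)
    steps = max(1, samples - 1)
    candidates = sorted({lo + round(k * (hi - lo) / steps) for k in range(steps + 1)})
    return min(
        candidates,
        key=lambda m: traversal_cost(
            m, count_overlaps(boxes, m), t_setup, t_step, t_inter
        ),
    )
```

Because the number of candidates is constant and each evaluation of $N_m$ is a single pass over the primitives, the whole selection stays linear in N. For 1000 point-like primitives and the default constants, `resolution_bounds(1000)` gives the interval (21, 71), and the sampled minimum falls at its lower end, as the point-based model predicts.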


4 Hierarchical grids

One of the main issues in designing a grid is that it does not adapt to the geometry distribution. Adaptive subdivisions often work better for complex scenes with uneven geometry distribution, but they are generally harder to build. Hierarchical grids (see e.g. [RSH00]) overcome this problem by using a recursive data structure. They provide a trade-off between traversal and building time. There are several ways to build hierarchical grids. The main idea is to subdivide some regions of space more finely than others, and thus quickly traverse empty space.
We give a brief outline of the most important ways to build hierarchical grids.

Loosely nested grids. Cazals et al. [CDP95] propose a hierarchical grid build which is able to handle very complex scenes. In particular they propose a four-step algorithm:
- organize the primitives of the scene into subsets of similar size;
- for each group of primitives, group the neighbors into clusters;
- construct a grid for each cluster;
- construct a hierarchy of these grids.
The proposed data structure can be seen as a loosely nested recursive grid. The filtering and clustering steps yield a bottom-up construction, in strong contrast with all other methods, which subdivide their structure adaptively in a top-down manner.
Klimaszewski and Sederberg [KS97] propose an adaptive grid data structure. They suggest building local grids that act as bounding boxes in densely populated areas. They use a fast bottom-up building algorithm:
- for each primitive, surround it with a bounding box (unstructured grids);
- for all bounding boxes, merge close boxes (structured grid);
- for all remaining boxes, insert the box into a tree using a minimum surface criterion;
- for all bounding boxes in the hierarchy, build a local grid.
Again, this data structure can be seen as a loosely nested recursive grid.

Both works seem to have a greater build cost compared with general bottom-up strategies.

Multi-resolution grids. Jevans and Wyvill [JW89] illustrate a recursive hierarchical subdivision algorithm for grids. All the primitives are initially inserted into a single voxel. There are two kinds of voxels: leaf voxels, which contain a list of the objects inside that voxel, and internal voxels, which maintain a voxel grid subdivision. They use a hash table which stores each non-empty internal voxel. They propose several build methods:
- octree-like, setting the size of voxel grids to 2 on a side;
- setting a maximum depth of the subdivision tree;
- fixing the resolution of the voxel subgrids to a constant value;
- variable (adaptive) voxel resolution.
Notice that the build approach is top-down. This integration of regular and adaptive spatial subdivision methods allows images consisting of large regularly distributed objects and small dense objects to be ray traced efficiently. The parameters controlling the coarseness of the voxel grid, the depth of the adaptive subdivision trees, and the maximum number of polygons per voxel are tuned, and their effects on execution time, subdivision time, and memory use are measured.

Macrocell or multigrids. Parker et al. [PPL*99] apply a simple hierarchical optimization, called macrocells, to a base uniform grid. Macrocells superimpose a second, coarser grid over the original fine grid, such that each macrocell corresponds to an A×A×A block of original grid cells. Each macrocell stores a boolean flag specifying whether any of its corresponding grid cells are occupied. Parker et al. also organize cells in bricks to improve locality. The build process is top-down, similar to multi-resolution grids.

Other works address the problem of adaptivity with different solutions. For example, in proximity clouds [CS94], a ray traversing empty space is assisted by distance values which allow long skips along the ray direction.
The cost model suggests a way to build hierarchical grids, and in particular to determine a termination criterion to stop the recursion. For example, if $T_{setup} + m\,T_{step} + \lambda\,T_{inter} < N\,T_{inter}$, that is, if traversing a nested grid costs less than testing all N primitives of a cell directly, then recursion brings a benefit in grid traversal time (see Subsection 3.2). However, it is worth noting that, without a good indexing strategy (such as Jevans and Wyvill's hash table [JW89]), it is not possible to use deep levels of recursion.

5 Hardware architectures considerations

The current trend in CPU and GPU hardware design is towards three concepts: a streaming compute model, vector-like SIMD units, and multi-core architectures. These new architectures combine more and more parallel computation, with fast local computation and slow memory access. This favors methods that reduce memory accesses (e.g., compact data structures, compression, multiresolution ...), maximize local memory accesses, and can be easily parallelized, especially in a SIMD/MIMD setting. For example, memory layouts (e.g., bricking [PPL*99]) should become more and more important for improving performance. This is even more important for the streaming compute model supported by current hardware.

Moreover, the current trend is to implement rendering methods on GPUs. However, GPUs present some limitations. For example, fragment programs are not allowed to perform data-dependent branching and have more restricted memory access (e.g., the number of levels of dependent texture fetches is limited). Purcell et al. [PBMH02] showed that it is possible to do ray tracing with programmable graphics hardware. Further work has used different data structures such as kd-trees or BVHs. Results showed that ray tracing so far does not fully utilize the GPU.


The traversal algorithm is unsuited for GPU computation, and branching is unavoidable. Today, for ray tracing, CPUs seem to be more suitable than GPUs.

6 Conclusion

This paper presents a survey of ray tracing algorithms that use the grid data structure.

Several aspects have been examined: how to traverse a grid efficiently, build methodologies and their impact on traversal algorithms, and how adaptivity can be exploited by building nested hierarchical grids.

We believe that two main trends will drive research on this data structure. The first is the need to render dynamic scenes, which benefits from the fast build time of grids. The second is the rise of parallel hardware, which benefits from the ease with which grids can be parallelized.

References

[AW87] AMANATIDES J., WOO A.: A fast voxel traversal algorithm for ray tracing. In Eurographics '87 (Aug. 1987), pp. 3–10.
[Bre65] BRESENHAM J.: Algorithm for computer control of a digital plotter. IBM Systems Journal 4, 1 (1965), 25–30.
[BSW06] BOULOS S., SHIRLEY P., WALD I.: Geometric and Arithmetic Culling Methods for Entire Ray Packets. Tech. rep., University of Utah, SCI Institute, 2006.
[CDP95] CAZALS F., DRETTAKIS G., PUECH C.: Filtering, clustering and hierarchy construction: a new solution for ray tracing complex scenes. Computer Graphics Forum 14, 3 (1995).
[CS94] COHEN D., SHEFFER Z.: Proximity clouds – an acceleration technique for 3d grid traversal. Vis. Comput. 11, 1 (1994), 27–38.
[CW88] CLEARY J. G., WYVILL G.: Analysis of an algorithm for fast ray tracing using uniform space subdivision. The Visual Computer 4, 2 (July 1988), 65–83.
[FTI] FUJIMOTO A., TANAKA T., IWATA K.: Arts: Accelerated ray-tracing system. IEEE Computer Graphics & Applications.
[Gla84] GLASSNER A. S.: Space subdivision for fast ray tracing. IEEE Computer Graphics & Applications 4, 10 (Oct. 1984), 15–22.
[Hav00] HAVRAN V.: Heuristic Ray Shooting Algorithms. PhD thesis, Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, November 2000.
[HMS06] HUNT W., MARK W. R., STOLL G.: Fast kd-tree construction with an adaptive error-bounded heuristic. In 2006 IEEE Symposium on Interactive Ray Tracing (Sept. 2006).
[ISP07] IZE T., SHIRLEY P., PARKER S. G.: Grid Creation Strategies for Efficient Ray Tracing. In IEEE/EG Symposium on Interactive Ray Tracing (Sept. 2007), pp. 27–32.
[IWRP06] IZE T., WALD I., ROBERTSON C., PARKER S. G.: An Evaluation of Parallel Grid Construction for Ray Tracing Dynamic Scenes. In Proceedings of the 2006 IEEE Symposium on Interactive Ray Tracing (2006), pp. 27–55.
[JW89] JEVANS D., WYVILL B.: Adaptive voxel subdivision for ray tracing. In Graphics Interface (June 1989), pp. 164–172.
[KS97] KLIMASZEWSKI K. S., SEDERBERG T. W.: Faster ray tracing using adaptive grids. IEEE Comput. Graph. Appl. 17, 1 (1997), 42–51.
[LZY04] LIU Y. K., ZALIK B., YANG H.: An integer one-pass algorithm for voxel traversal. Comput. Graph. Forum 23, 2 (2004), 167–172.
[ORM07] OVERBECK R., RAMAMOORTHI R., MARK W. R.: A Real-time Beam Tracer with Application to Exact Soft Shadows. In Eurographics Symposium on Rendering (2007).
[PBMH02] PURCELL T. J., BUCK I., MARK W. R., HANRAHAN P.: Ray tracing on programmable graphics hardware. ACM Transactions on Graphics 21, 3 (July 2002), 703–712.
[PGSS06] POPOV S., GÜNTHER J., SEIDEL H.-P., SLUSALLEK P.: Experiences with streaming construction of SAH KD-trees. In Proceedings of the 2006 IEEE Symposium on Interactive Ray Tracing (Sept. 2006), pp. 89–94.
[PPL 99] PARKER S., PARKER M., LIVNAT Y., SLOAN P.-P., HANSEN C., SHIRLEY P. S.: Interactive ray tracing for volume visualization. IEEE Transactions on Visualization and Computer Graphics 5, 3 (July/Sept. 1999), 238–250.
[RSH00] REINHARD E., SMITS B., HANSEN C.: Dynamic acceleration structures for interactive ray tracing. In Eurographics Workshop on Rendering (2000), pp. 299–306.
[RSH05] RESHETOV A., SOUPIKOV A., HURLEY J.: Multi-level ray tracing algorithm. ACM Transactions on Graphics 24, 3 (Aug. 2005), 1176–1185.
[Sra95] SRAMEK M.: A comparison of some ray tracing generators for ray tracing volumetric data. In The Third International Conference in Central Europe on Computer Graphics and Visualization 1995 (1995), pp. 466–475.
[Wal06] WALD I.: Realtime Ray Tracing and Interactive Global Illumination. PhD thesis, Computer Graphics Group, Saarland University, Saarbrücken, Germany, 2006.
[WH06] WALD I., HAVRAN V.: On building fast kd-trees for ray tracing, and on doing that in O(N log N). In Proceedings of the 2006 IEEE Symposium on Interactive Ray Tracing (2006).
[Whi80] WHITTED T.: An improved illumination model for shaded display. Communications of the ACM 23, 6 (1980), 343–349.
[WIK 06] WALD I., IZE T., KENSLER A., KNOLL A., PARKER S. G.: Ray tracing animated scenes using coherent grid traversal. ACM Transactions on Graphics 25, 3 (July 2006), 485–493.
[WMG 07] WALD I., MARK W. R., GÜNTHER J., BOULOS S., IZE T., HUNT W., PARKER S. G., SHIRLEY P.: State of the Art in Ray Tracing Animated Scenes. In Eurographics 2007 State of the Art Reports (2007).
[Wyv90] WYVILL B.: Symmetric double step line algorithm. Graphics gems (1990), 101–104.
[ZCO97] ZALIK B., CLAPWORTHY G., OBLONSEK C.: An efficient code-based voxel-traversing algorithm. Comput. Graph. Forum 16, 2 (1997), 119–128.

Figure 5: Impact of resolution on four test scenes: two scanned models, an architectural model, and a particle-based model. Of the two theoretical models, our tests suggest that real-world scenes are closer to the point model than to the line model. In particular, particle-based scenes are less sensitive to the choice of resolution. As expected, the architectural model has a higher traversal cost.
