Article
Raymarching Distance Fields with CUDA
Avelina Hadji-Kyriacou and Ognjen Arandjelović *
School of Computer Science, University of St Andrews, St Andrews KY16 9SX, UK; [email protected]
* Correspondence: [email protected]
Abstract: Raymarching is a technique for rendering implicit surfaces using signed distance fields. It
has been known and used since the 1980s for rendering fractals and CSG (constructive solid geometry)
surfaces, but has rarely been used for commercial rendering applications such as film and 3D games.
Raymarching was first used for photorealistic rendering in the mid 2000s by demoscene developers
and hobbyist graphics programmers, receiving little to no attention from the academic community
and professional graphics engineers. In the present work, we explain why the use of Simple and Fast
Multimedia Library (SFML) by nearly all existing approaches leads to a number of inefficiencies,
and hence set out to develop a CUDA oriented approach instead. We next show that the usual data
handling pipeline leads to further unnecessary data flow overheads and therefore propose a novel
pipeline structure that eliminates much of redundancy in the manner in which data are processed
and passed. We proceed to introduce a series of data structures which were designed with the specific
aim of exploiting the pipeline’s strengths in terms of efficiency while achieving a high degree of
photorealism, as well as the accompanying models and optimizations that ultimately result in an
engine which is capable of photorealistic and real-time rendering on complex scenes and arbitrary
objects. Lastly, the effectiveness of our framework is demonstrated in a series of experiments which
compare our engine both in terms of visual fidelity and computational efficiency with the leading
commercial and open source solutions, namely Unreal Engine and Blender.
Keywords: rendering; sphere tracing; ray tracing; graphics; photorealism; CUDA kernels; acceleration
2. Proposed Framework
For both rasterization and ray tracing engines, there are multiple pipeline architectures,
all with different strengths and weaknesses. However, raymarching is largely unexplored
territory in this regard. The pipeline developed herein is summarized in Figure 1. Unlike
most graphical engines, the render pipeline for this engine resides mostly on the graphics
card, eliminating one of the bottlenecks present in many graphics engines: data transfer
and API calls from host to device.
Figure 1. The render pipeline for the proposed engine resides mostly on the graphics card, eliminating one of the bottlenecks due to data transfer and API calls from host to device. Light and object data are stored in vectors on the host and copied once per frame to device memory. Textures are uploaded into VRAM and exposed using a texture object descriptor.
In a typical raster engine, data such as textures, meshes, and shader code are uploaded and cached in device VRAM (video random access memory), but how these data are assembled into a 3D world is completely controlled by the host. The host assembles all these data with what are known as "draw calls". A draw call itself often
consists of multiple API commands, such as binding textures, switching shader programs
or specifying vertex buffers. A bottleneck occurs when the host issues commands faster
than the device can complete the work. One solution to this would be batching large
amounts of work into a single call, known as instancing; for example, draw calls in a
loop where only some parameters change between each iteration can be replaced by a
single draw call that also takes a list of the precomputed parameters for all instances in an
array. Instancing only works in cases where only minor parameters are changed between
each instance, such as object transforms or more abstract parameters used by shaders.
Instancing, however, is not suitable for rendering different types of objects: it is not possible to batch draw calls for drawing a tree with draw calls for drawing a table, nor can it be used in cases where the objects require different shaders or different buffers to be bound.
Our engine has no concept of draw calls, which eliminates this CPU bottleneck entirely. Instead of issuing commands that need to be processed by the graphics driver, the engine uploads arrays of parameters in batches directly from host memory into device memory on the GPU, which the model uses to generate the scene. The model itself is defined in device code, in contrast to a raster engine, where the application model is defined by the commands issued by host code.
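As an illustration of this batched upload, a minimal sketch is given below; the helper and type names here are illustrative, not the engine's exact API:

```cpp
#include <cuda_runtime.h>
#include <vector>

// One bulk copy per parameter array replaces the many driver-mediated
// draw calls of a raster engine; the device-side model consumes these
// arrays directly.
template <typename T>
void uploadArray(const std::vector<T>& host, T* device) {
    cudaMemcpy(device, host.data(), host.size() * sizeof(T),
               cudaMemcpyHostToDevice);
}
```

Called once per frame for the light, object, and material vectors, such a copy is the extent of the host's per-frame involvement in scene assembly.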
Figure 2. (a) SFML frame memory, and (b) OpenGL-CUDA frame memory.
Once the PBO is written to in the OpenGL address space, the PBO needs to be drawn to the screen. There are a couple of different approaches: the first would be to prepare a Vertex Buffer Object (VBO) representing a rectangle that covers the screen with UV coordinates normalised to [0 . . . 1] in both axes, and then use a trivial vertex shader and fragment shader to display the rectangle with the PBO as a texture. The second approach, used herein, is to create an explicit Frame Buffer Object (FBO) into which the PBO contents can be copied every frame, avoiding the need to use the rasterization hardware of the GPU to display the texture. This requires significantly more setup than the SFML approach but allows for greater control over the rendering pipeline, e.g., the choice of the bit depth of buffers for a trade-off between memory usage and quantization artefacts.
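As a hedged illustration of this second approach, the display step might look as follows, assuming a GL texture `tex` that receives the PBO contents, and `pbo`, `width`, and `height` defined during setup (identifiers here are illustrative):

```cpp
// One-time setup: wrap the texture in a read framebuffer.
GLuint fbo;
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, tex, 0);

// Per frame: unpack the PBO into the texture, then blit it straight
// to the default framebuffer, bypassing the raster pipeline.
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBindTexture(GL_TEXTURE_2D, tex);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                GL_RGBA, GL_UNSIGNED_BYTE, nullptr); // reads from the bound PBO
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
                  GL_COLOR_BUFFER_BIT, GL_NEAREST);
```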
All textures are created as mipmapped texture objects; a simple kernel can be run
to asynchronously compute the mipmaps for each texture during loading. Textures are
managed by BindlessTexture objects which utilise C++ templating to support textures of
different formats (e.g., unsigned char, unsigned short, etc.) and internal formats (float1,
float4, uchar4, etc.). Once a texture has been loaded, it is stored in a linked list to keep track
of the texture objects and mipmap arrays which need to be freed on program closure (or
when no longer needed). To read a texture from device code, the only value that needs to
be known is the “tex” variable.
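As a sketch of what such a texture object amounts to in the CUDA runtime API, one concrete instantiation is shown below (float4 texels; `width`, `height`, and `numLevels` are assumed, and the engine's templated BindlessTexture wrapper generalizes over formats):

```cpp
// Allocate a mipmap chain and describe it as a texture object.
cudaMipmappedArray_t mipArray;
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float4>();
cudaMallocMipmappedArray(&mipArray, &desc,
                         make_cudaExtent(width, height, 0), numLevels);

cudaResourceDesc resDesc = {};
resDesc.resType = cudaResourceTypeMipmappedArray;
resDesc.res.mipmap.mipmap = mipArray;

cudaTextureDesc texDesc = {};
texDesc.addressMode[0]      = cudaAddressModeWrap;
texDesc.addressMode[1]      = cudaAddressModeWrap;
texDesc.filterMode          = cudaFilterModeLinear;
texDesc.mipmapFilterMode    = cudaFilterModeLinear;
texDesc.maxMipmapLevelClamp = float(numLevels - 1);
texDesc.normalizedCoords    = 1;

cudaTextureObject_t tex; // the single handle device code needs
cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);
// Device-side read: float4 c = tex2DLod<float4>(tex, u, v, lod);
```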
Each light is defined by the following parameters:
• colour—a homogeneous 3D vector for the colour of the light. The fourth component corresponds to the inverse of the light intensity. In the material editor, this is displayed as two separate values: a linear RGB colour value and a linear intensity value.
• attenuation—a 3 component vector corresponding to constant, linear and quadratic
attenuation for the light. If a light is a sky light, the attenuation values are ignored
and a value of [1,0,0] is used since the light is considered to be infinitely distant.
• hardness—a linear value corresponding to the ’softness’ of shadows cast by the light.
A minimum value of 3 is considered maximum softness without artefacts and a
maximum value of 128 is considered to cast perfectly sharp shadows.
• radius—a linear value corresponding to the maximum distance that a light can illuminate.
• shadowRadius—a linear value corresponding to the maximum distance that a light
can cast shadows.
• shadows—a boolean value to enable (true) or disable (false) shadow casting.
Lights are stored in a vector on the host and copied once per frame to device memory in the same fashion as materials. The first two lights in the vector are special and are used to calculate the 'sky' and 'fog'. The light editor can add and remove lights dynamically at run time. A sketch of a device-side structure consistent with the parameter list above is shown below.
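This is a hedged sketch only; field names and layout are illustrative, not the engine's exact definition:

```cpp
// A light layout consistent with the parameter list above.
struct Light {
    float4 colour;       // RGB + inverse intensity in the w component
    float3 attenuation;  // constant, linear, quadratic coefficients
    float  hardness;     // 3 = softest shadows, 128 = perfectly sharp
    float  radius;       // maximum illumination distance
    float  shadowRadius; // maximum shadow-casting distance
    bool   shadows;      // enable/disable shadow casting
};
```

With such a layout, the per-frame upload reduces to a single cudaMemcpy of the host vector, as sketched in the batched-upload example earlier.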
A library of signed distance functions (SDFs) has been implemented for a wide range of implicit surfaces for the purposes of the experiments in the present work; these are included in the accompanying code. Each SDF takes different parameters to reflect the different modifiable characteristics of each surface type (e.g., the orientation of a plane, the radius of a sphere, the angle of a cone, etc.). All SDFs do, however, share regular characteristics: the first parameter is always the position of the sample point in space, all return a bound for the distance field (although most are exact), and all implicit surfaces have their origin at [0, 0, 0].
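Two representative SDFs following these conventions might read as follows (sketches using float3 helpers such as those in the CUDA samples' helper_math.h; the signatures in the accompanying code may differ):

```cpp
// Exact signed distance to a sphere of the given radius at the origin.
__device__ float sdSphere(float3 p, float radius) {
    return length(p) - radius;
}

// Exact signed distance to a plane with unit normal n and offset h.
__device__ float sdPlane(float3 p, float3 n, float h) {
    return dot(p, n) + h;
}
```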
Figure 3. The iterative mechanism of raymarching to find an intersection. The blue object is the
implicit surface. The black line is the ray. The red point is the ray origin and purple point is the
detected intersection. The red circles represent the distances sampled from the distance field and the
green curves represent the ‘march forward’ each iteration.
$$rd_0 = [rd_{0x} \; rd_{0y} \; rd_{0z}]^T = \frac{[u \; v \; f]^T}{\sqrt{u^2 + v^2 + f^2}}, \qquad (1)$$
where u, v, f are, respectively, the x coordinate of the UV, the y coordinate of the UV, and the camera focal length, and $rd_0$ is a 3D direction vector. This process creates a 'virtual
matrix’ of vectors spanning across all threads representing a pinhole camera with focal
length f oriented towards [0, 0, 1]. To orient the rays in the direction specified by the virtual
camera, they must be transformed using two rotation matrices. Given the camera angles φ,
θ, the following transformation gives the correctly oriented ray direction rd:
$$rd = \begin{bmatrix} rd_x \\ rd_y \\ rd_z \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{bmatrix} \begin{bmatrix} rd_{0x} \\ rd_{0y} \\ rd_{0z} \end{bmatrix} \qquad (2)$$
The only other vector needed to define a camera ray is ro which is simply the 3D
vector corresponding to the camera position. From this, any position p along this ray can
be computed as p = ro + rd · t.
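A per-thread sketch combining Equations (1) and (2) is given below, again using the float3 helpers assumed earlier (the function name and parameters are illustrative):

```cpp
// Generate a world-space camera ray direction for one pixel's UV (u, v),
// focal length f, and camera angles (phi, theta), per Equations (1)-(2).
__device__ float3 cameraRay(float u, float v, float f,
                            float phi, float theta) {
    // Equation (1): pinhole ray in camera space, oriented towards +z.
    float invLen = rsqrtf(u * u + v * v + f * f);
    float3 rd0 = make_float3(u * invLen, v * invLen, f * invLen);

    // Equation (2): rotate about x by phi, then about y by theta.
    float3 rx = make_float3(rd0.x,
                            cosf(phi) * rd0.y - sinf(phi) * rd0.z,
                            sinf(phi) * rd0.y + cosf(phi) * rd0.z);
    return make_float3( cosf(theta) * rx.x + sinf(theta) * rx.z,
                        rx.y,
                       -sinf(theta) * rx.x + cosf(theta) * rx.z);
}
// Any point along the ray: p = ro + rd * t.
```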
where tmin represents the near clipping plane. However, it is computationally impossible to
iterate infinitely, so, instead, we impose a maximum iteration depth nmax for n, a maximum
tmax limit for t (the far clipping plane), and a short circuit intersection distance hmin for h.
This results in a modified set of recurrence relations:
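In standard sphere tracing terms, and consistent with the symbols used here (a sketch, not necessarily the engine's exact formulation), these relations amount to:

$$t_0 = t_{min}, \qquad h_n = map(ro + rd \cdot t_n), \qquad t_{n+1} = t_n + h_n,$$

with an intersection reported when $h_n < h_{min}$, and no intersection when $t_n > t_{max}$ or $n > n_{max}$.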
These formulae hold true when all surfaces defined by map have bounded or exact distance fields, but this is not true for all surface types; for example, a heightmap displaced plane will often give overestimations and underestimations of the distance field, which could potentially cause a ray to pass through high frequency displacements in the surfaces without detecting an intersection. To prevent this from happening, it is possible to change the $h_n$ expression such that the result of the map function is multiplied by a coefficient k, where 0 < k ≤ 1. The smaller this coefficient is, the less likely an overestimation is to happen, at the cost of requiring more marches to reach an intersection than for a coefficient of exactly 1. The solution to this is to vary the value of
k between neighbouring pixels and between consecutive frames and then use temporal
anti-aliasing techniques to remove the resulting high frequency noise with Monte Carlo
integration [9].
However, this requires that an analytical derivative of the map can be computed, which itself requires analytical derivatives for all SDFs. This is not always possible, as some surfaces may not have an analytical derivative, or may have one which would not translate to performant device code.
Instead of obtaining the exact normal, it is possible to obtain an approximation, e.g., using forward differences or central differences. However, this results in a total of six map evaluations for the three partial derivatives, which could introduce considerable overhead for particularly complex scenes. If we assume that map(p) ≪ 1 due to p being at an intersection, it is possible to obtain the normal using only three evaluations of the map function when using the forward difference approach:
$$\mathrm{normal}(p) \approx \mathrm{normalize}\begin{bmatrix} map(p + [h, 0, 0]) \\ map(p + [0, h, 0]) \\ map(p + [0, 0, h]) \end{bmatrix} \qquad (11)$$
However, normals obtained in this way will be less numerically accurate. Instead, by using the tetrahedron technique, it is possible to obtain surface normals with the same accuracy as the central difference approach using only four evaluations. This approach samples the distance field at four equidistant vertices of a tetrahedron, giving four directional derivatives that can be summed together to give the approximation of the normal at the zero-isosurface:
$$\mathrm{normal}(p) \approx \mathrm{normalize}\left( \sum_{i=0}^{3} k_i \, map(p + h k_i) \right) \qquad (12)$$
This technique is adopted in the proposed renderer for normal evaluation. One drawback (shared with the previously discussed techniques) is that the normal value is inaccurate at sharp intersections of multiple surfaces. One way to avoid this would be to evaluate only the SDF of the closest object in place of the map. This can be achieved by altering the map function to take a 'mask' value such that only SDFs with the material ID obtained in the march step contribute to the distance field.
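A device-code sketch of Equation (12), with the four $k_i$ taken as the vertices of a regular tetrahedron, is shown below (`map` and the float3 helpers are assumed as before):

```cpp
// Tetrahedron-technique normal: four map evaluations at offsets h*k_i.
__device__ float3 tetrahedronNormal(float3 p, float h) {
    const float3 k0 = make_float3( 1.f, -1.f, -1.f);
    const float3 k1 = make_float3(-1.f, -1.f,  1.f);
    const float3 k2 = make_float3(-1.f,  1.f, -1.f);
    const float3 k3 = make_float3( 1.f,  1.f,  1.f);
    return normalize(k0 * map(p + k0 * h) +
                     k1 * map(p + k1 * h) +
                     k2 * map(p + k2 * h) +
                     k3 * map(p + k3 * h));
}
```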
In addition to tracking t to determine occlusion, the algorithm also tracks the minimum penumbra factor $res_n$ at each raymarch iteration. The penumbra factor is calculated as:
$$res_n = \min\left( res_{n-1}, \; \frac{kh}{t} \right), \qquad (13)$$
where $res_0$ is initialised with a value of 1 (no occlusion), k is a coefficient for the shadow hardness, and h, t are the surface distance and ray position coefficient as defined in the raymarching recurrence relations. Hence, the recurrence relations for the soft shadow casting algorithm are:
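In device code, the resulting march can be sketched as follows (an illustrative sketch; `map` and the float3 helpers are assumed as before, and the engine's exact formulation may differ):

```cpp
// March from the surface towards the light, accumulating Equation (13).
// Returns a penumbra factor in [0, 1]: 0 = fully occluded, 1 = unoccluded.
__device__ float softMarch(float3 ro, float3 rd, float k,
                           float tmin, float tmax, int nmax) {
    float res = 1.0f;                 // res_0: no occlusion
    float t = tmin;
    for (int n = 0; n < nmax && t < tmax; ++n) {
        float h = map(ro + rd * t);
        if (h < 1e-4f) return 0.0f;   // hit an occluder
        res = fminf(res, k * h / t);  // Equation (13)
        t += h;
    }
    return res;
}
```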
$$h = \frac{l + v}{\| l + v \|} \qquad (17)$$
where l, v are the incoming light and view vectors, respectively. This model also, conveniently, ensures energy conservation such that the energy of outgoing light from a surface cannot exceed the energy of incoming light. Phong's model [11], probably the most common reflectance model used, fails in this respect and thus produces obviously unrealistic shading. To ensure the conservation of energy, the diffuse shading component $k_d$ should be multiplied by $1 - k_s$, where $k_s$ is the specular counterpart [10].
The reflectance equation describes the outgoing light Lo as the sum of emitted light
Le and reflected light. Reflected light is the sum of all incoming light Li multiplied by
the surface reflection f r (the BRDF) and the cosine of the incident angle. This leads to the
following expression:
$$L_o = L_e + \int_{\Omega} f_r \, L_i \, (\omega_i \cdot n) \, d\omega_i, \qquad (18)$$
where Ω is the unit hemisphere aligned with surface normal n [13]. For the surface reflectance function $f_r$, herein we adopted the Cook–Torrance BRDF [10], which is the weighted sum of the Lambertian diffuse and Cook–Torrance specular terms:

$$f_r = k_d \, f_{Lambert} + k_s \, f_{CookTorrance} \qquad (19)$$
where $k_d$ and $k_s$ are the weightings of the diffuse and specular terms, respectively, and $k_d + k_s = 1$ so as to ensure the conservation of energy. The Lambertian diffuse term is simply described as the surface albedo c normalized for integration over the hemisphere.
This gives the Lambertian term:

$$f_{Lambert} = \frac{c}{\pi} \qquad (20)$$
The Cook–Torrance specular term is more complex. It is defined as the product of three functions D, F, G, divided by four times the product of the dot product between the surface normal n and the outgoing light direction $\omega_o$ and the dot product between the surface normal and the negative of the incoming light direction $\omega_i$:
$$f_{CookTorrance} = \frac{DFG}{4 (\omega_o \cdot n)(\omega_i \cdot n)} \qquad (21)$$
The three functions are used to approximate different parts of a surface's reflective properties. D, the normal distribution function, approximates the probability of surface microfacets being aligned with the halfway vector based on the surface roughness. G, the geometry function, describes the self-shadowing property of microfacets, whereby rough surfaces may have some microfacets that occlude light reflections caused by other microfacets. F, the Fresnel equation, describes the ratio of surface reflections at different surface angles.
For the normal distribution function, the Trowbridge–Reitz GGX approximation is
used [14,15]. Given the surface normal n, halfway vector h and roughness parameter α:
$$D_{GGXTR}(n, h, \alpha) = \frac{\alpha^2}{\pi \left( (n \cdot h)^2 (\alpha^2 - 1) + 1 \right)^2} \qquad (22)$$
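Equation (22) translates directly into a device function; a sketch (with the dot product clamped to zero, as is common practice):

```cpp
#include <math_constants.h> // for CUDART_PI_F

// Trowbridge-Reitz GGX normal distribution function, Equation (22).
__device__ float distributionGGX(float3 n, float3 h, float alpha) {
    float a2  = alpha * alpha;
    float ndh = fmaxf(dot(n, h), 0.0f);
    float d   = ndh * ndh * (a2 - 1.0f) + 1.0f;
    return a2 / (CUDART_PI_F * d * d);
}
```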
The equation used is the Fresnel–Schlick approximation [17], which models the increase in specular reflection at grazing angles with a fifth power weighting (a commonly used approximation):
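In its standard form, with h the halfway vector:

$$F_{Schlick}(h, \omega_o, F_0) = F_0 + (1 - F_0)\left(1 - (\omega_o \cdot h)\right)^5$$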
The value $F_0 = 0.04$ is the default for non-metallic materials in this engine, and is equal to the albedo for metallic surfaces (i.e., when the metalness parameter is equal to 1). However, custom values may be used as described in Section 2.3. The Fresnel equation is also used for the diffuse coefficient component, where $k_d = 1 - F$.
Combining all the above formulae:
$$L_o = \int_{\Omega} \left( (1 - F) \frac{c}{\pi} + \frac{DFG}{4 (\omega_o \cdot n)(\omega_i \cdot n)} \right) L_i \, (\omega_i \cdot n) \, d\omega_i \qquad (27)$$
Light radiance $L_i$ is calculated from the light colour $l_{dehomogenized}$, the polynomial attenuation model coefficients $A_{Constant}$, $A_{Linear}$, and $A_{Quadratic}$, the distance d, and the light radius r, which leads to:
$$L_i = \frac{l_{dehomogenized}}{A_{Constant} + d A_{Linear} + d^2 A_{Quadratic}} \cdot \frac{r - d}{r} \qquad (28)$$

$$l_{dehomogenized} = \left[ \frac{l_r}{l_w}, \; \frac{l_g}{l_w}, \; \frac{l_b}{l_w} \right]^T \qquad (29)$$

where $[l_r, l_g, l_b, l_w]^T$ is the homogeneous light colour vector. The second term in the
radiance equation is responsible for interpolating the light radiance down to zero as the
distance coefficient approaches the light radius. This allows for a smooth transition to zero
radiance as the distance of the surface to the light reaches the radius, allowing for the light
to be skipped when beyond the maximum radius.
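Equations (28) and (29) map directly onto device code; a sketch assuming the Light layout from earlier (field names are illustrative, helpers as before):

```cpp
// Radiance of a light at distance d from the shaded point.
__device__ float3 lightRadiance(const Light& l, float d) {
    if (d >= l.radius) return make_float3(0.0f, 0.0f, 0.0f); // skippable
    float3 c = make_float3(l.colour.x, l.colour.y, l.colour.z)
             / l.colour.w;                             // Equation (29)
    float att = l.attenuation.x + d * l.attenuation.y
              + d * d * l.attenuation.z;               // polynomial model
    return c / att * ((l.radius - d) / l.radius);      // Equation (28)
}
```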
Planar projection is suitable for planar surfaces, such as floors, walls, ceilings, etc., but
it is unsuitable for more complex surfaces such as spheres as this will cause the textures
to stretch as the angle between the surface normal and the normal of the projection plane
increases. A solution to this is to apply planar mapping in three perpendicular planes
and apply interpolation based on the surface normal alignment. This is called triplanar
mapping, also known as round cube mapping. To do this, we require both the position p
and surface normal n. This gives a simple formula with three texture lookups weighted by
the absolute normal contribution for that axis.
$$texture_{triplanar}(p, n) = \frac{|n_x| \, sample(p_y, p_z) + |n_y| \, sample(p_x, p_z) + |n_z| \, sample(p_x, p_y)}{|n_x| + |n_y| + |n_z|} \qquad (31)$$
This function can be further parameterized with a ‘hardness’ parameter k which allows
for the contribution of each planar texture sample to be weighted more or less heavily
based on alignment. By taking the kth power of the absolute normal, the following formula
for weighted triplanar mapping emerges:
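That is:

$$texture_{triplanar}(p, n, k) = \frac{|n_x|^k \, sample(p_y, p_z) + |n_y|^k \, sample(p_x, p_z) + |n_z|^k \, sample(p_x, p_y)}{|n_x|^k + |n_y|^k + |n_z|^k} \qquad (32)$$

which reduces to Equation (31) when k = 1. In device code this might be sketched as follows, with tex2D on a texture object standing in for sample (helpers as before):

```cpp
// Weighted triplanar texture lookup with hardness parameter k.
__device__ float4 triplanar(cudaTextureObject_t tex, float3 p, float3 n,
                            float k) {
    float wx = powf(fabsf(n.x), k);   // hardness-weighted contributions
    float wy = powf(fabsf(n.y), k);
    float wz = powf(fabsf(n.z), k);
    float4 sx = tex2D<float4>(tex, p.y, p.z); // projection along x
    float4 sy = tex2D<float4>(tex, p.x, p.z); // projection along y
    float4 sz = tex2D<float4>(tex, p.x, p.y); // projection along z
    return (sx * wx + sy * wy + sz * wz) / (wx + wy + wz);
}
```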
Triplanar mapping is suitable for shading since it results in a maximum of three samples per texture per camera ray. However, it is not suitable for the sampling of heightmaps in the map function, as the texture lookups happen each iteration during marching, which results in a much larger memory bandwidth overhead than simple biplanar mapping, which uses only one texture lookup.
$$GI_{diffuse} = k_d \, \frac{c}{\pi} \int_{\Omega} sky(\omega_i) \, SoftMarch(p, \omega_i, 3) \, (n \cdot \omega_i) \, d\omega_i \qquad (33)$$
with a shadow hardness of k = 3 used to simulate the wide radius over which the sky is
being sampled.
Since numerically integrating fully over the hemisphere is computationally infeasible, we must instead sample the hemisphere uniformly. Given two random temporal samples $temporal_x$, $temporal_y$, and the surface normal n, a random sample direction $\omega_i$ is obtained as follows:
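One standard construction is sketched below; this is a hedged illustration, and the engine's exact mapping from the temporal samples may differ (CUDART_PI_F from math_constants.h, vector helpers as before):

```cpp
// Uniform hemisphere sample about n from two random values in [0, 1).
__device__ float3 sampleHemisphere(float3 n, float tx, float ty) {
    float phi      = 2.0f * CUDART_PI_F * tx;
    float cosTheta = ty;                 // uniform over the solid angle
    float sinTheta = sqrtf(1.0f - cosTheta * cosTheta);
    float3 local = make_float3(cosf(phi) * sinTheta,
                               sinf(phi) * sinTheta, cosTheta);
    // Build an orthonormal basis around n and transform the sample.
    float3 up = fabsf(n.z) < 0.999f ? make_float3(0, 0, 1)
                                    : make_float3(1, 0, 0);
    float3 t = normalize(cross(up, n));
    float3 b = cross(n, t);
    return t * local.x + b * local.y + n * local.z;
}
```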
The more samples taken per frame, the more accurate the lighting is at the cost of
increased computation. By taking only one sample per frame, the image has a distinct
checkerboard pattern when temporal AA is disabled, but when temporal AA is enabled, an
accurate image is produced within only a few frames owing to the effective Monte Carlo
integration over the hemisphere over time. The diffuse GI calculated here is added to the
pixel colour of the first shading pass.
2.8.4. Reflections
Real-time reflections are difficult to produce accurately for any rendering method, often necessitating screen space techniques to approximate them. The approach we take is to sample the skybox in a similar fashion as with the GI algorithm: by using the Cook–Torrance specular term (instead of the Lambertian diffuse term) and by sampling the skybox with a specular lobe distribution (instead of a uniform hemisphere).
We can construct an equation for the specular global illumination in a similar way to
the diffuse equation using the specular term of the Cook–Torrance BRDF and a shadow
hardness based on the surface roughness α:
$$GI_{specular} = k_s \int_{\Omega} \frac{DFG}{4 (\omega_o \cdot n)(\omega_i \cdot n)} \, sky(\omega_i) \, SoftMarch\!\left(p, \omega_i, 3 + \frac{1}{\alpha}\right) (n \cdot \omega_i) \, d\omega_i \qquad (40)$$
Like the diffuse GI, the specular reflections rely on temporal anti-aliasing for an
accurate result with Monte Carlo integration over time. The specular GI calculated here is
added to the pixel colour computed from the first shading pass and diffuse GI pass.
3. Evaluation
3.1. Visual Fidelity
The real-time visual fidelity achieved by this renderer is comparable to that of offline high-fidelity techniques such as ray tracing, while surpassing techniques used by real-time raster renderers in both performance and quality. All images in this section were rendered at 1280 × 720 on an RTX 2080 Super. All comparisons were made against Blender version 2.90 and Unreal Engine 4 (UE4) version 4.25.3.
3.1.1. Displacements
An insightful testing example is that of displaced planes rendered at very high framerates. There are three techniques for displaying detailed surfaces with microdisplacements: geometry displacement, parallax occlusion mapping, and raymarching. We recreated a scene from the demonstration scene in Blender and UE4 for the purposes of comparison; see Figure 4.
The raymarched render of the carpet is produced with a frame duration of less than
8 ms and a video memory footprint of 1 GB (although this is mostly due to storing the
textures needed for the full demo even though only the carpet is visible). Owing to TAA, the
produced image is fairly photorealistic with fine detail. The main advantage of this method
is the high image fidelity without computing any actual geometry while still supporting
global shadowing and self shadowing. The primary disadvantage of this technique is
requiring expensive texture lookups on each iteration.
The image produced by ray tracing with Cycles is also photorealistic with outstanding detail, although it took a full 9 s to render with Nvidia OptiX acceleration, with a peak video memory footprint of 3 GB; this was mostly due to the need to store the data for all vertices, which is the main disadvantage of this technique. Despite this, the image produced is remarkably accurate, including the multibounce shadowing interactions within the crevices of the carpet. The image produced with parallax occlusion mapping in UE4 is about as photorealistic and as performant as the raymarched image, taking under 9 ms to produce with a video memory footprint of 400 MB. Like raymarching, no
actual geometry is computed, making it very lightweight. However, this technique comes
with significant disadvantages such as being unable to produce dynamic self shadows and
being unable to cast global shadows onto other objects of the scene. Another drawback is
the visible ‘stair-stepping’ of the surface when viewed up close and at sheer angles, where
the effect breaks down completely. This system also requires expensive texture lookups
like the raymarching solution.
From this example, it is evident that raymarching is on par with parallax occlusion mapping with regard to resource usage, and is arguably of better quality, with no visible artefacts when temporal anti-aliasing is enabled for the raymarcher.
3.1.2. Fractals
Another revealing test case is the rendering of fractals; see Figure 5. The main free renderer used for fractals is Mandelbulb3D (MB3D), a cumbersome program which supports only CPU-based rendering and which remained closed source until 2015, when it was taken over by a new maintainer. Since the program takes no advantage of hardware acceleration, it often takes several seconds to produce an image.
MB3D lacks many features present in our renderer, such as texture support, support
for other implicit surface types, HDR lighting, PBR shading, etc. MB3D works using a
similar technique called fixed step ray marching, where a binary search is performed with
fixed sample positions along a ray to determine the intersection surface.
Overall, MB3D is a poor candidate for the real-time rendering of fractals when compared to our raymarching renderer, which is able to render fractals in real time in under 10 ms per frame while also supporting reflections, shadows, texturing, and anti-aliasing.
Figure 4. Raymarched displacement plane (top). Raytraced displacement plane rendered with
Blender Cycles (middle). Rasterized parallax occlusion mapping plane in UE4 (bottom).
Figure 5. A power 4 Mandelbulb rendered in Mandelbulb3D (top) and our raymarcher (bottom).
3.2. Lighting
Shadow casting is an important feature of a realistic renderer. Regardless of the technique used, it is costly to compute. We recreated a scene which includes multiple spheres with different materials and 64 point lights, all casting shadows, to compare the performance of our raymarcher and a commercially available engine; see Figures 6 and 7.
Both engines run at about 30 frames per second, giving comparable performance; however, in UE4, the point lights are unable to cast shadows correctly on the floor plane due to the drawbacks of parallax occlusion mapping. Both renderers produce similarly realistic lights with specular reflections as well as shadow reflections, both achieved through temporal anti-aliasing. When the point light shadows are disabled, both engines run at much higher frame rates, upwards of 90 frames per second.
Another important part of rendering realistic lighting is the ability to cast soft shadows. We recreated another scene from the demo in UE4 and tried to match the lighting conditions as closely as possible.
It is clear that UE4 struggles to cast soft shadows in such a simple scene because rasterized shadow maps do not take shadow penumbras into account. Another interesting phenomenon to note is that the sphere geometry is visible in UE4 in the shadows, as meshes are composed of triangles, whereas in raymarching surfaces have exact mathematical representations.
It should also be noted that UE4 is running with frame durations of around 8 ms while
our raymarching renderer runs with around 6 ms frame durations, meaning the Monte
Carlo integration via TAA converges on realistic light conditions 20% faster in the proposed
renderer than UE4 for this scene.
Figure 6. Real-time shadows in the raymarching renderer (top) and UE4 (bottom).
Figure 7. A comparison of soft shadows between our renderer (top) and UE4 (bottom).
Figure 8. A comparison of the performance difference between bounding volume optimisation being
enabled (right) and disabled (left). The views shown are the final pass (top), kernel clock cycles
(middle), and SDF evaluation count per pixel (bottom).
A similar phenomenon can be observed in the SDF evaluation views where the outlines
of the bounding volumes are visible around the rows of primitives when enabled. It is also
evident that fewer evaluations occur in areas of shadow due to a decreased number of map
evaluations upon shadow rays converging on the casting surface.
Although a GPU is often considered SIMD, it is actually more accurate to describe
modern GPUs as MIMD since different Streaming Multiprocessors (SMs) can process
different instruction streams simultaneously over one or more thread warps per SM.
Combined with the fact that the GPU is saturated with threads for this workload (i.e., more
threads are scheduled than slots available to simultaneously compute them), the GPU
can finish computation of the cheaper parts of the frame early and distribute the more
expensive threads across the available SMs.
Lastly, as a means of providing additional insight into the performance of our engine, using the example scene shown in Figure 9, we provide in Figures 10–14 a series of renderings, some of which illustrate the results of different intermediate computational stages of a rendering and others which quantify, in an easily comprehensible manner, the associated computational burden.
References
1. Hart, J.C.; Sandin, D.J.; Kauffman, L.H. Ray tracing deterministic 3D fractals. In Proceedings of the 16th Annual Conference on
Computer Graphics and Interactive Techniques, Boston, MA, USA, 31 July–4 August 1989; pp. 289–296.
2. Hart, J.C. Sphere tracing: A geometric method for the antialiased ray tracing of implicit surfaces. Vis. Comput. 1996, 12, 527–545.
[CrossRef]
3. Quilez, I. Rendering Worlds with Two Triangles with Ray Tracing on the GPU. 2008. Available online: https://www.iquilezles.org/www/material/nvscene2008/rwwtt.pdf (accessed on 10 June 2021).
4. Quilez, I. Making a Simple Apple with Maths. 2011. Available online: https://www.youtube.com/watch?v=CHmneY8ry84/
(accessed on 10 June 2021).
5. Quilez, I. Inigo Quilez: Articles. Available online: https://www.iquilezles.org/www/index.htm/ (accessed on 10 June 2021).
6. Granskog, J. CUDA Ray Marching. 2017. Available online: http://granskog.xyz/blog/2017/1/11/cuda-ray-marching/ (accessed on 10 June 2021).
7. Keeter, M.J. Massively parallel rendering of complex closed-form implicit surfaces. ACM Trans. Graph. 2020, 39, 4. [CrossRef]
8. Mallett, I.; Seiler, L.; Yuksel, C. Patch Textures: Hardware Support for Mesh Colors. IEEE Trans. Vis. Comput. Graph. 2020, in press.
[CrossRef] [PubMed]
9. Jensen, H.W.; Christensen, N.J. Photon maps in bidirectional Monte Carlo ray tracing of complex objects. Comput. Graph. 1995,
19, 215–224. [CrossRef]
10. Cook, R.L.; Torrance, K.E. A reflectance model for computer graphics. ACM Trans. Graph. 1982, 1, 7–24. [CrossRef]
11. Lafortune, E.P.; Willems, Y.D. Using the Modified Phong Reflectance Model for Physically Based Rendering; Report CW 197; KU Leuven:
Leuven, Belgium, 1994.
12. Blinn, J.F. Models of light reflection for computer synthesized pictures. In Proceedings of the 4th Annual Conference on Computer
Graphics and Interactive Techniques, San Jose, CA, USA, 20–22 July 1977; pp. 192–198.
13. Nicodemus, F.E. Directional reflectance and emissivity of an opaque surface. Appl. Opt. 1965, 4, 767–775. [CrossRef]
14. Trowbridge, T.; Reitz, K.P. Average irregularity representation of a rough surface for ray reflection. JOSA 1975, 65, 531–536.
[CrossRef]
15. Walter, B.; Marschner, S.R.; Li, H.; Torrance, K.E. Microfacet Models for Refraction through Rough Surfaces. In Proceedings of the
18th Eurographics Conference on Rendering Techniques, Grenoble, France, 25–27 June 2007; pp. 195–206.
16. Karis, B. Real Shading in Unreal Engine 4; Epic Games, 2013. Available online: https://www.bibsonomy.org/bibtex/203641889131c93632e2790ab7d25aa9d/ledood (accessed on 31 October 2021).
17. Schlick, C. An inexpensive BRDF model for physically-based rendering. In Computer Graphics Forum; Wiley: Hoboken, NJ, USA,
1994; Volume 13, pp. 233–246.