2022 IEEE 42nd International Conference on Distributed Computing Systems Workshops (ICDCSW)

Performance Analysis of RTX Architecture in Virtual Production and Graphics Processing
978-1-6654-8879-2/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICDCSW56584.2022.00048

Tony Oakden, AIE Institute, Canberra, Australia, [email protected]
Manolya Kavakli, AIE Institute, Sydney, Australia, [email protected]
ORCID 0000-0003-3241-6839

Abstract: Real-time rendering techniques developed for computer games, combined with improved algorithms and advanced hardware such as the Nvidia GeForce RTX 3000 series of graphics cards, improve the quality of rendered images in CGI. In this paper, our goal is to test the performance of RTX architecture in Virtual Production and graphics processing. We conducted a series of tests for rendering a scene in the Unreal game engine in a Virtual Production studio. Images are rendered in 4K and output to a network distribution system where the image is broken down into a series of smaller images, each rendered onto LED screens. The comparison of render times between two graphics workstations, one using Nvidia RTX A6000 GPUs and the other an Nvidia GeForce RTX 3090 GPU, shows that whilst RTX architecture produces better image quality, the gains might not be worth the additional hardware cost required by the high-end graphics cards. It might also be optimal to split the rendering of the scene across multiple computers.

Keywords: Real-time rendering, Graphics Processing, Virtual production, Game engineering

I. INTRODUCTION

Virtual Production (VP) integrates virtual and augmented reality technologies with CGI and VFX using a game engine to enable on-set film production crews to capture and unwrap scenes in real time [1]. The use of real-time rendering techniques, developed for computer games, offers great opportunities in film production. As stated in [2], until recently, most film production has been done at 2K (2048 pixels wide). As more 4K (4096 pixels wide) digital projectors become available, there has been a push to use 4K throughout the virtual production process. While this technically increases the image quality, some issues still need to be addressed.

This paper focuses on hardware used for virtual production. Our goal is to test the performance of RTX architecture in Virtual Production (VP) and graphics processing. In the remaining sections we first review graphics processing and RTX architecture, and conduct a performance analysis on RTX architecture. Then, we discuss the test results, and evaluate the performance of two graphics cards used in virtual production.

II. ADVANTAGES OF VP

Modern rendering systems can produce photorealistic scenes in real-time and those can be projected onto screens behind the actors. With VP the actors are placed in front of a live screen displaying exactly what will appear on the final film. Thus, directors and cinematographers are able to better assess the quality of shots in real-time and adjust the final output accordingly. This significantly reduces production time and improves the quality. As discussed in [1] and [3], VP offers significant advantages over the traditional green screen production methods.

Ray Tracing (RT) has been used for CGI movies for many years but it is only recently that its application in real-time has become practical. This is partly due to improved software, but advances in hardware, such as the Nvidia GeForce RTX 3000 series of graphics cards, have provided hardware support for real-time lighting and improved the quality of the rendered scenes in CGI when compared to traditional real-time rendering techniques [4]. This technology has also simplified the processes required to light a real-time scene. In traditional rasterized images, lightmaps and light probes must be created and placed by artists. With RT in VP, artists are instead free to focus on lighting photorealistic scenes in much the same way as lighting engineers in a traditional film production.

III. RTX ARCHITECTURE

The graphics cards powered by Ampere, NVIDIA's 2nd generation RTX architecture [4], with new ray tracing (RT) cores, Tensor Cores, and streaming multiprocessors for the most realistic ray-traced graphics and cutting-edge AI features, promise to revolutionize real-time rendering over the coming decade. It is expected that RTX architecture [5] will largely replace the current rendering techniques for all mid to high range hardware applications. It is likely to become the standard way to light scenes, which is particularly relevant to virtual production applications [6] where real-time images are essential for believable integration of live actors and CGI.

RTX architecture (the NVIDIA Turing™ architecture) combined with the GeForce RTX™ platform fuses together real-time Ray Tracing (RT), AI, and programmable shading. It allows for significantly more transistors to be packed into a smaller area, using the 2nd generation of consumer RT and third generation deep learning hardware. “This dedicated RT hardware can cast upwards of 10 gigarays per second, allowing real-time, movie-like lighting in games. Real-time ray tracing is only possible because RTX graphics cards deliver up to 6x faster ray-tracing performance… RTX graphics cards also offer Tensor Cores capable of delivering over 100 teraflops of AI processing to accelerate gaming performance with NVIDIA DLSS [25].” However, there are some problems with the process for which there are currently no good solutions. One of these is that the system requires placement of light probes in the scene, which are used in real-time by the GPU hardware for quantization of the scene.

While only RTX cards have RT and Tensor Cores, the GTX 16-Series also uses the shared Turing architecture to offer high performance. The GTX cards excel at immersive gaming, providing great experiences in popular titles [17] such as Fortnite, Overwatch, Counter-Strike: Global Offensive and Wolfenstein II: The New Colossus by making use of Turing's concurrent floating point and integer operation, and advanced shading, including variable rate
shading to deliver higher frame rates with lower power consumption for great performance and value [25].

IV. PERFORMANCE ANALYSIS ON RTX ARCHITECTURE

We conducted a series of tests for rendering a VP scene produced using a computer running the Unreal game engine to compare the performance of two machines. Images are rendered in 4K and output to a network distribution system where the image is broken down into a series of smaller images, each rendered onto LED screens which are made up from a number of subpanels. Unreal engine provides many options to allow for very high-quality scenes to be rendered. The two machines have different sets of graphics cards, but both cards have very similar architecture. Both machines run Windows 10 with Unreal 4. The 3D scene is constructed in Unreal and rendered using a number of separate virtual cameras. These are configured to produce output in three different places on the same 4K screen. In the usual configuration machine 1 is used for rendering all the graphics for the scene and the 2nd computer is not used. It is possible to split the rendering load between multiple machines, but we typically do not need to do this.

A. Machine1

Machine1 has two Nvidia RTX A6000 GPU cards [7] connected using Nvidia's proprietary NVLink bridge [8], which allows the cards to share the rendering load. This is very much a high-end card and very rarely found in home gaming rigs, even less so in an NVLink configuration. Machine1 uses high end components such as:
• AMD Ryzen 9 5950X 16 Core Processor
• 64GB RAM
• Nvidia RTX A6000 48GB Video Card
• Total Machine Price ~ $21,000 AUD
Each RTX A6000 has the following components:
• 48 Gigabytes of GDDR6 memory
• 10,752 Ampere architecture based CUDA cores
• 84 RT (Ray tracing) cores
• 336 Tensor Cores
• Memory bandwidth 768 GB/s

B. Machine2

Machine2 has one Nvidia GeForce RTX 3090 GPU:
• AMD Ryzen 9 5900X 12 Core Processor
• 64GB RAM
• Nvidia GeForce RTX 3090 24GB Video Card
• Total Machine Price ~ $7,000 AUD
The RTX 3090 has the following components:
• 24 Gigabytes of GDDR6X memory
• 10,496 Ampere architecture based CUDA cores
• 82 RT (Ray tracing) cores
• 328 Tensor Cores
• Base Clock 1.4 GHz

C. NVLink

NVLink is Nvidia's proprietary solution to enable graphics processing tasks to be split between multiple GPU cards installed on the same motherboard, as the successor to SLI. The technology was designed to work with real-time and offline rendering tasks, but the gaming community has not adopted this technology and few commercial games support it. The technology does offer advantages for offline rendering tasks and other GPU tasks. Nvidia only provides support for NVLink in the higher performance cards such as the RTX 3090 and 6000 series.

D. Screen Configuration

In this section, we briefly discuss the hardware configuration for the simple case of using a single computer to render all the graphics. In the testing studio, there are three LED walls as follows: one large back wall, which contains the image that will appear behind the actors, and two smaller walls, typically behind the camera, used to provide the fill-in lighting. All three walls are made up from the same 500 mm square LED panels, each with a resolution of 192 x 192 pixels. Note that a 4K image can be split over the three walls with area to spare. Table I describes the configuration of the main screen which appears behind the actors, and Table II describes the two smaller screens.

TABLE I. THE MAIN LED WALL CONFIGURATION
Dimension   #panels   Pixels/panel   Total pixels   Size (mm)   Pitch (mm)
X           16        192            3072           8000        2.6
Y           6         192            1152           3000        2.6

TABLE II. THE CONFIGURATION OF THE TWO SMALLER LED WALLS
Dimension   #panels   Pixels/panel   Total pixels   Size (mm)   Pitch (mm)
X           5         192            960            2500        2.6
Y           4         192            768            2000        2.6

The output from the machine is split over multiple displays using a piece of hardware made by Brompton Technology: the Tessera SX40 [9]. The output further feeds into a Brompton Tessera XD distribution unit before the ethernet cable daisy chains over all the panels in the walls. Even though there are only three screens, the scene is actually rendered multiple times in Unreal to four different viewports. This is because the image which appears in the camera frustum is required to move according to the camera's position in the scene. However, the position of objects casting reflections, or of lighting objects, should not move in response to the camera.

Fig. 1 shows a screen grab from Unreal showing the plan layout of a typical scene. The curved yellow line on the right is the main LED screen. The two smaller lines on the right are the fill-in screens used for lighting. The white sphere in the center is the probe used to position the cameras to take the environmental background image, the yellow camera is the camera which is synchronized to the physical camera using the Vive Puck (VIVE Tracker that brings any real-world object into your virtual world), and the other lines are scene geometry used in this demo. Fig. 2 is the scene as seen in the Unreal editor to give a better indication of how it fits together, showing only one of the fill-in LED walls. The position of the camera is synchronized to the position of the physical camera in the scene. Unreal then renders the scene from that point of view and superimposes the image over the static camera image for the main wall in such a way that the image appears in the correct position behind the actors.
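
The arithmetic behind Tables I and II, and the remark that a 4K image covers all three walls with area to spare, can be checked with a short sketch. The panel size, panel resolution and panel counts come from Section IV.D; the 3840 x 2160 canvas and the stacked layout (main wall on top, the two fill walls side by side beneath it) are assumptions made only for illustration, since the paper does not say how the walls are arranged on the 4K output. The names `Wall` and `fits_on_canvas` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

PANEL_EDGE_MM = 500   # panel edge length stated in Section IV.D
PANEL_EDGE_PX = 192   # pixels along each panel edge, also from Section IV.D


@dataclass(frozen=True)
class Wall:
    """An LED wall described by its panel counts along X and Y."""
    name: str
    panels_x: int
    panels_y: int

    @property
    def pixels(self) -> Tuple[int, int]:
        # Total pixels = number of panels x pixels per panel, per axis.
        return (self.panels_x * PANEL_EDGE_PX, self.panels_y * PANEL_EDGE_PX)

    @property
    def size_mm(self) -> Tuple[int, int]:
        # Physical size = number of panels x panel edge length, per axis.
        return (self.panels_x * PANEL_EDGE_MM, self.panels_y * PANEL_EDGE_MM)


def pixel_pitch_mm() -> float:
    """Distance between neighbouring pixel centres on a panel."""
    return PANEL_EDGE_MM / PANEL_EDGE_PX


def fits_on_canvas(main: Wall, fill: Wall, fill_count: int,
                   canvas: Tuple[int, int]) -> bool:
    """Assumed layout: main wall on top, fill walls side by side beneath it."""
    main_w, main_h = main.pixels
    fill_w, fill_h = fill.pixels
    needed_w = max(main_w, fill_w * fill_count)
    needed_h = main_h + fill_h
    return needed_w <= canvas[0] and needed_h <= canvas[1]


MAIN = Wall("main", panels_x=16, panels_y=6)   # Table I
FILL = Wall("fill", panels_x=5, panels_y=4)    # Table II
UHD_4K = (3840, 2160)  # assumed canvas; the paper only says "4K"

if __name__ == "__main__":
    for wall in (MAIN, FILL):
        print(wall.name, wall.pixels, wall.size_mm)
    print("pitch (mm):", round(pixel_pitch_mm(), 1))
    print("fits on assumed 4K canvas:", fits_on_canvas(MAIN, FILL, 2, UHD_4K))
```

Under these assumptions the three walls need a 3072 x 1920 pixel region, which leaves spare room on the assumed canvas; a different packing or a 4096-pixel-wide canvas would change the margins but not the conclusion.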

Fig. 1. Plan view of example scene

Fig. 2. Screen grab of example scene in the Unreal editor window.

E. Resolution and response time

One of the biggest problems in VP is input lag and response time, as discussed by [11]. If the camera is positioned 4 meters from the screen and uses a 50mm lens, the frustum width on screen is ~3.4 meters, which at the 2.6 mm pixel pitch equates to roughly 1300 pixels. At this resolution screen pixels would be noticeable if they were in focus, and moiré patterns would be visible through the camera lens. This is solved by means of the camera lens depth of field: the LED wall is slightly out of focus because the actors are in focus in the foreground. Depth of field can also be simulated in the Unreal engine, so that objects in the distance are out of focus.

F. Simultaneous use of machines

Whilst it is possible to split the rendering of the scene between multiple computers, in practical terms this is problematic as it tends to lead to visible seams in the image. A better use of the technology is to use one computer to render to the main screen and another to render the back screens, and to perform rendering using RTX for the main screen and raster for the back screens. In this configuration, all computers need to be synchronized, and Unreal provides nDisplay, a tool which allows multiple computers to render the same scene and stay synchronized [12]. We have not used this in our tests.

G. Lighting Calculations

An understanding of the techniques used for lighting calculations for the scene is required to interpret the performance data presented. Therefore, in this section, we review the differences between rasterization (which is the traditional way to light a real-time 3D scene) and real-time RT (RTX as supported by Nvidia cards). In general, the render pipeline is split into three main stages:
1) Transformation of vertices from one coordinate system to another (e.g. model space into clip space)
2) Rasterization of the polygons in the scene geometry
3) The coloring of individual pixels which make up the polygons once projected to the render surface.
Even though modern scenes can contain millions of points which need transforming, it is the last stage in the pipeline which typically requires the most significant GPU resources. Each pixel needs to be colored and the calculations required to light it can be a very significant cost.

V. TESTING

Our goal in testing was to conduct a series of experiments to determine the relative effectiveness of various hardware and software configurations in terms of differences in frame rate when switching between RTX and traditional raster lighting. Data has been collected from the above hardware configuration using three different test scenes and a variety of different configurations. Several scenes were created in Unreal for testing, based on the scene shown with various modifications.
A. Complex Scene with complex lighting (Fig 3)
B. Complex Scene Simple lighting
C. Simple Scene
D. Overlapped transparency
E. Particle effects

Fig. 3. Render of the complex scene

Table III summarizes the mesh count and triangle count, and Table IV the lighting configuration, for the test scenes.

TABLE III. SCENE COMPLEXITY FOR THE TEST SCENES
Scene   Mesh Count   Triangle Count
A       1,878        49,556,769
B       1,878        49,556,769
C       730          21,607,888
D       730          21,607,888

TABLE IV. THE LIGHTS
Scene   Dynamic Lights    Dynamic Lights   Stationary   Static
        (shadows: yes)    (shadows: no)
A       1                 15               9            6
B       0                 0                2            3
C       0                 0                2            3
D       0                 0                2            3

Additional Translucency Assets:
• Mesh Count: 14 Meshes
• Particle Count: 13,200 Particles
For the transparent scenes an additional option was explored: testing with real-time ray tracing on only the transparency, and on both the particles and the transparency.
The principal hardware configurations tested were:
• Machine 1 – NVLink enabled
• Machine 1 – NVLink disabled
• Machine 2
For every combination above, tests were performed with the following lighting:
• Real-time ray tracing
• Rasterization
The tests are run without the physical camera being used, but some movement is simulated using a script which simply rotates the camera through 360 degrees and moves it along a track. This provides some degree of averaging across different scene complexity.

A. Data Analysis

Data was gathered using the Unreal Insights tool. “Unreal Insights helps developers identify bottlenecks, which is useful when optimizing for performance [16].” The tool runs in parallel whilst the game engine is running and captures information in real-time concerning every frame rendered. Our process was to load the scene, start the profiler and then run the script to simulate the camera movement. The data from Unreal Insights was then saved out as a file which can be reloaded later for analysis. The tool is a great aid when attempting to analyse a scene and look for ways to improve its performance. The metric which we chose to compare performance was the average time taken to render each frame. To calculate the average frame time we simply divide the time the test took to run (which is available in the session summary tab) by the total number of frames rendered; the average frame rate in frames per second is the reciprocal of this value. We also note the maximum startup frame time, the maximum frame time after startup and the minimum frame time. This process was completed for all the test data we collected, and the results were collated into an Excel spreadsheet. Data was split between the non-transparent and transparent scenes. Session data was first grouped by the hardware configuration and then, within those groups, sub-groups were created for raster or ray traced, then again by the scene complexity.

Results were plotted to show the comparison of render times in similar hardware and software configurations across the six nontransparent scenes. Output from a typical session is shown in Fig 4. The vertical axis shows frames rendered per second: the higher the frame rate, the smoother any visible animation on the screens will be and the faster the scene will update if the camera is being moved. Results are grouped into those for the same scene, with the three hardware configurations next to one another for comparison. On all graphs the two RTX A6000 cards with NVLink are first, followed by the single RTX A6000 and finally the RTX 3090. The first group is the most complex scene and the last is the simplest scene.

VI. RESULTS

As might be expected, RT is the slowest to render and the simple scene with simple lighting the fastest. However, there are some interesting aspects to note here. First of all, whilst the two Nvidia RTX A6000 cards running on NVLink are the fastest hardware, they are not significantly so. Secondly, without NVLink, when only one RTX A6000 is running, performance is worse on machine1 than it is on machine2. This is a surprise because machine2 is inferior to machine1 in every way. Again, performance comparisons of the Nvidia GeForce RTX 3090 and the Nvidia RTX A6000 suggest that the former is superior for gaming purposes, and because we are using Unreal as the tool for rendering the scenes, which is primarily aimed at the games market, it may be heavily optimized for the RTX 3090.

[Figure 4: bar chart of average FPS (vertical axis 0 to 60) for six groups: Ray trace; Ray trace, simple lights; Ray trace, simple scene; Raster; Raster, simple lights; Raster, simple scene. Series: A6000 SLI, A6000, 3090.]
Fig. 4. Comparison of display update (Average FPS)

Fig 5 shows the Timing Window and how tasks are distributed between the various processing cores and threads. Analysis of the timing window from Unreal Insights in Fig 6 sheds some light on why NVLink is not providing a significant improvement in performance, as it shows that the two GPUs are rarely fully occupied.

Fig. 5. Typical Output from Unreal Insight

Fig. 6. Typical Output from the timing window (when Ray tracing enabled)

For a typical set of frames GPU1 runs at approximately 80% capacity, but GPU2 is only occupied for 50% of the time. Further analysis shows that GPU2 is still running tasks after GPU1 starts generating the next frame, and the next frame cannot be started until GPU2 has completed its tasks. This in turn delays GPU1 from starting the next frame. This sort of scheduling problem of one task waiting for another to
complete the task is very common in multi-threaded systems and would require careful scheduling to fully occupy both GPU threads. For the single processor (both RTX A6000 and RTX 3090) we found that GPU1 was almost fully occupied all the time. Hence, for the NVLink configuration what we typically see is only approximately 130% of the performance of the single GPU configuration (consistent with the 80% and 50% occupancy figures above), and even then, performance will further be lost through the additional scheduling work that needs to be done.

Our observations with regards to the performance of NVLink closely follow those found in an online article [8]. In this article the author compares the performance of systems using a pair of Nvidia RTX 2080Ti and RTX 2080 cards with and without NVLink enabled. For their experiments they used a set of graphics cards to ray trace 5 scenes of varying complexity using VRay [13]. Their experiments compare render time in seconds. Their scenes are substantially more complex than ours, with render times of up to 500 seconds to generate a 4K image. In most cases the systems with NVLink perform slightly worse than the systems without it.

A. Transparency

We created a scene with many transparent overlapping layers and another with overlapping layers and particles. The intention was to obtain a quick estimate of the cost of transparency with the available hardware. Fig. 7 shows the average frame rate for each scene when rendering transparency. As with the previous graphs, results are grouped by scene and render complexity.

[Figure 7: bar chart of average FPS (vertical axis 0 to 60). Series: A6000, A6000 no SLI, 3090.]
Fig. 7. Transparency Average FPS

In Fig. 7, the first group shows the relative performance of the three hardware configurations with transparency rendered using RT. The second shows relative performance with particles and transparency and RT. The next two show the same scenes rendered using rasterization. The final three are a combination of raster for transparency and ray tracing for particles. As before, there is not much advantage between the two graphics cards for the ray traced scene. The top end hardware is the fastest, but only by about 20%. The Nvidia GeForce RTX 3090 outperforms the single Nvidia RTX A6000. With the particles included the twin Nvidia RTX A6000 cards perform about the same, but the other two configurations perform somewhat worse. The introduction of particles has not significantly affected the frame rate; possibly the complexity of the scenes was not sufficient to stress the hardware, although visually the particles were quite obvious.

The scene which uses transparency is significantly faster to render using raster techniques than RT, over 300% in fact. Nvidia have provided some information as to how their RTX system handles ray tracing, which provides some insight into why this might be. For opaque meshes a flag should be set which ensures that a closest-hit shader is called once when a ray strikes a polygon. For non-opaque meshes an any-hit shader must be called and the ray may continue after intersecting the geometry. This is clearly slower than for opaque meshes [14], because multiple ray-to-geometry collisions will likely need to be calculated for the same ray, and the any-hit shader itself is more complex, as it has to correctly calculate the layers of transparency and possibly spawn additional rays depending on the lighting technique used.

For rasterization there are several different techniques to handle non-opaque geometries [15]; which one is used is engine dependent. Transparency presents many problems for modern rendering hardware and render pipelines. It appears that RTX is no exception.

VII. EVALUATION

As to why the Nvidia GeForce RTX 3090 is almost as fast as the significantly more expensive Nvidia RTX A6000, it is likely that Unreal is optimized more towards this card, since Unreal is predominantly aimed at the gaming market and the Nvidia RTX A6000 is aimed more at the professional graphics market. It is unlikely that many gamers would consider purchasing an Nvidia RTX A6000; the RTX 3090 is currently considered state of the art. Finally, we note that Ray Tracing, even on the top end RTX cards, is significantly slower in all configurations than the equivalent scenes rendered with raster. The simple scene with simple lighting (one light) takes almost as long to render with ray tracing as the most complex scene using rasterization. This indicates that although the hardware is extremely capable, the workload involved in real-time ray tracing is still very high. This may be acceptable if the quality of rendered output is substantially better. However, informal tests indicate that in a side by side comparison most users struggled to tell the difference between real-time ray traced images and rasterized ones. Of course, this is at least in part due to the effort put into making the rasterized scene look acceptable (baking lights, light probes etc.) but still, it is an issue which needs to be considered.

RTX6000 has promised that designers and artists could have the power of hardware-accelerated ray tracing, deep learning, and advanced shading to dramatically boost their productivity and create content faster than ever before. However, industry reports, as seen in [17] and [18], have published disappointing outcomes, while the Nvidia GeForce RTX 3090 was rated at 100% in terms of performance [19]. Although we were unable to locate one to one testing of these cards, there are close enough comparisons between RTX2080Ti and RTX6000. A benchmarking report [18] states that the RTX2080Ti, for example, is best suited for small-scale model development, rather than full-scale training workloads. On the other hand, the RTX 6000, while at a significantly higher cost than the RTX2080Ti, has the benefits of both the 2080Ti's blower design and the TITAN RTX's large memory capacity. The blower design allows up to 4 cards to be configured in a single workstation. The Nvidia RTX A6000 can be densely populated in a system, whilst boasting large memory capacity
for large models. This makes it preferable in deep learning tasks for computer vision.

Is the NVIDIA Quadro RTX 6000 good value for money? This question is currently being explored in the industry. The architecture leveraging DirectX Raytracing (DXR), OptiX (a ray tracing API), and Vulkan (a cross-platform API, open standard for 3D graphics and computing that targets high-performance real-time 3D graphics applications, such as video games and interactive media) was first introduced in August 2018 at SIGGRAPH 2018 in the workstation-oriented Quadro RTX cards [16], and later at Gamescom [21]. As seen in [22], the earlier Turing generation experienced some glitches in addition to high prices, poor availability and raytracing at a low level. Many users such as [23] and [24] reported failures of the RTX 2080Ti. However, we have not observed such glitches in RTX3090 and RTX6000. [26] also compared the NVIDIA Quadro RTX 6000 with the most popular graphics cards. The NVIDIA Quadro RTX 6000 scores poorly in their evaluation, barely achieving 69% performance compared to the AMD Radeon 6900 XT and the NVIDIA GeForce RTX 3090, which are cost effective alternatives. Our findings are aligned with these studies.

VIII. CONCLUSION

In this paper, we conducted experiments to test various hardware configurations using a variety of different scenes. More specifically, we compared a professional-oriented graphics card, the NVIDIA RTX A6000, with a consumer-oriented card, the GeForce RTX 3090. The results were collated and presented graphically. Our high-end hardware did not offer the performance gains we had expected over the lower end hardware, and further research is needed to understand why. RTX was shown to be significantly slower, even on the top end hardware, than rasterization. Some additional experiments were carried out to test the effects of transparency in the scene; this had a significant detrimental effect on performance, which was noticeably worse when rendering using RTX rather than rasterization.

In summary, we conclude that whilst RTX produces better image quality, the gains might not be worth the additional hardware cost required. It might also be optimal to split the rendering of the scene across multiple computers, where the images destined for the displays which do not appear in the frustum are rendered by lower spec computers using rasterization and at a lower resolution, but the main screen is rendered by higher end hardware.

In future, first, we intend to create a series of more complex scenes to adequately stress the hardware used in the high-end machine and run similar tests to the ones in this paper. Second, we plan to test the hardware configuration by physically switching the cards over and running the tests on the same computer to eliminate CPU issues. Third, we plan to assess the relative quality of the ray traced and rasterised versions of the scenes, using experimental subjects to view the output and to rate the quality.

ACKNOWLEDGMENT

We are grateful to Andy Marriott, Managing Director of Silver Sun Pictures, for giving us permission to conduct testing in their VP Studio, and to Lachlan Emanuel for running the tests and helping us in data collection.

REFERENCES

[1] Manolya Kavakli and Cinzia Cremona, 2022: The Virtual Production Studio Concept – A Game Changer in Filmmaking, IEEE VR 2022: the 29th IEEE Conference on Virtual Reality and 3D User Interfaces, 12-16 March, 2022, Virtual, p.1-10
[2] Jeffrey A. Okun and Susan Zwerman, 2021. The VES Handbook of Visual Effects, 3rd Edition, Taylor and Francis, London
[3] Tony Oakden and Manolya Kavakli, 2022: Graphics Processing in Virtual Production, 14th International Conference on Computer and Automation Engineering (ICCAE 2022), March 25-27, 2022, Brisbane, Australia, 1-6
[4] NVIDIA, 2021a. GEFORCE RTX 3080 FAMILY THE ULTIMATE PLAY, last accessed on 29 Oct 2021, https://www.nvidia.com/en-au/geforce/graphics-cards/30-series/rtx-3080-3080ti/
[5] NVIDIA, 2021b. NVIDIA RTX™ platform, last accessed on 29 Oct 2021, https://developer.nvidia.com/rtx
[6] Unreal, 2021. Storytelling reimagined, last accessed on 29 Oct 2021, https://www.unrealengine.com/en-US/solutions/film-television
[7] a6000-datasheet-us-nvidia, 2022. Retrieved from Nvidia.com: https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/proviz-print-nvidia-rtx-a6000-datasheet-us-nvidia-1454980-r9-web%20(1).pdf
[8] Branko Gapo, 2021. NVLink vs. SLI and Multiple GPUs – Is it worth it? Retrieved from CGdirector: https://www.gpumag.com/nvlink-sli-difference/
[9] Tessera SX40, 2022. https://www.bromptontech.com/product/sx40/
[10] Brompton Tessera XD, 2022. https://www.bromptontech.com/product/xd/S.
[11] Manolya Kavakli, 2022: Requirements for reducing the input lag in a Virtual Production Studio, HCI International 2022, 24th International Conference on Human-Computer Interaction, 26 June - 1 July 2022, Gothenburg, Sweden
[12] Sevan Dalkian, 2019. “nDisplay Technology Whitepaper,” Epic Games, https://cdn2.unrealengine.com/Unreal+Engine%2Fndisplay-whitepaper-final-updates%2FnDisplay_Whitepaper_FINAL-f87f7ae569861e42d965e4bffd1ee412ab49b238.pdf.
[13] Chaos, 2022. Retrieved from https://www.chaos.com/
[14] Juha Sjoholm, 2018. Effectively Integrating RTX Ray Tracing into a Real-Time Rendering Engine. Retrieved from Nvidia Developer Blog: https://developer.nvidia.com/blog/effectively-integrating-rtx-ray-tracing-real-time-rendering-engine/
[15] Alex Dunn and Louis Bavoil, 2014. Transparency (or Translucency) Rendering. Retrieved from Nvidia Developer Blog: https://developer.nvidia.com/content/transparency-or-translucency-rendering
[16] Epic, 2022. nDisplay Overview. Retrieved from https://docs.unrealengine.com/: https://docs.unrealengine.com/4.27/en-US/WorkingWithMedia/IntegratingMedia/nDisplay/Overview/
[17] https://benchmarks.ul.com/hardware/gpu/NVIDIA+Quadro+RTX+6000+review
[18] https://www.exxactcorp.com/blog/Benchmarks/deep-learning-benchmarks-comparison-2019-rtx-2080-ti-vs-titan-rtx-vs-rtx-6000-vs-rtx-8000-selecting-the-right-gpu-for-your-needs
[19] GPU Benchmarks Ranking (https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html)
[20] https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/5
[21] "NVIDIA TURING GPU ARCHITECTURE: Graphics Reinvented" (PDF). Nvidia. 2018. Retrieved June 28, 2019.
[22] https://www.pcbuildersclub.com/en/2018/11/faulty-rtx-2080-ti-nvidia-switches-from-micron-to-samsung-for-gddr6-memory/
[23] Florian Maislinger, 2018. "Faulty RTX 2080 Ti: Nvidia switches from Micron to Samsung for GDDR6 memory". PC Builder's Club. Retrieved July 15, 2019.
[24] https://www.igorslab.de/
[25] https://blogs.nvidia.com/blog/2019/11/01/whats-the-difference-between-nvidia-rtx-and-gtx/
