2024 IEEE Gaming, Entertainment, and Media Conference (GEM)

A Comparison of Performance on WebGPU and WebGL in the Godot Game Engine

Emil Fransson, Jonatan Hermansson, Yan Hu
Department of Computer Science, Blekinge Institute of Technology, Karlskrona, Sweden

979-8-3503-7453-7/24/$31.00 ©2024 IEEE | DOI: 10.1109/GEM61861.2024.10585437

Abstract: WebGL has been the standard API for rendering graphics on the web over the years. A new technology, WebGPU, was set for release in 2023 and utilizes many of the novel rendering approaches and features common to native modern graphics APIs, such as Vulkan. Currently, very limited research exists regarding WebGPU's rasterization capabilities. In particular, no research exists about its capabilities when used as a rendering backend in game engines. This paper aims to investigate performance differences between WebGL and WebGPU. It does so in the context of the game engine Godot, and the measured performance is the CPU and GPU frame time. The results show that WebGPU performs better than WebGL when used as a rendering backend in Godot, for both the game tests and the synthetic tests. The comparisons clearly show that WebGPU achieves shorter mean CPU and GPU frame times.

Index Terms: Game Engine, Performance Overhead, Rendering, WebGPU, WebGL

I. INTRODUCTION

Modern video games leverage sophisticated graphics application programming interfaces (APIs) to render highly detailed worlds. They accomplish this at interactive frame rates by utilizing the powerful graphics processing units (GPUs) that modern computers are equipped with. Commonly used APIs include Direct3D [1] for machines running Windows, Metal [2] for Apple products, and Vulkan [3] and OpenGL [4] as cross-platform alternatives. Those APIs all target native platforms, and as is evident, many choices are available to developers. However, when it comes to rendering on the web, the choices narrow significantly. WebGL has been the lowest-level alternative for rendering on the web [5]. It is based on the aforementioned OpenGL native API and adopts the same workflow and syntax. WebGL is a cross-platform, open-source API for rendering interactive 2D and 3D graphics on the web, with an initial release in March 2011. A typical WebGL program consists of control code written in JavaScript and shader code written in the OpenGL Shading Language (GLSL). Additionally, Emscripten can compile C/C++ OpenGL code into WebAssembly, allowing the WebGL API to be used from lower-level languages [6]. WebGL is a mature API supported by many different hardware products and browsers. It has been applied in many environments and fields, such as rendering backends in the gaming industry and visualization in medicine and geospatial applications. WebGL currently exists as a possible rendering backend in the Godot engine for rendering graphics on the web platform.

This paper presents an implementation of a rendering backend for the game engine Godot using WebGPU, currently the latest low-level web graphics API, and compares its performance in various test cases to that of the WebGL backend currently implemented in Godot. WebGPU is a new graphics API that aims to bring a more modern API workflow to web platforms, with the first draft of its specification released in 2021 [7]. Like the previously mentioned modern APIs, it aims to enable the developer to work closer to the hardware of the machine it is running on. The native API that the web browser uses underneath is determined by the operating system on which it is executed: depending on the system, the browser may use Direct3D 12, Vulkan, or Metal. As with these APIs, WebGPU provides developers with relatively direct access to previously inaccessible low-level GPU resources. It also employs a stateless syntax, which leads to fewer API calls and thus less API overhead compared to the stateful syntax of WebGL, inherited from OpenGL.

Section II lists related work on WebGPU. Section III details the overall research method, including the implementation details and how the experiment and data gathering were conducted. Section IV presents the results and analysis of the conducted experiment. Section V contains a discussion of the performed work, and the final section presents the conclusions and future work.

II. RELATED WORK

A study by Hidaka et al. found that their implementation of a deep neural network (DNN) using WebGPU performed around 36 times faster (91 ms versus 3297 ms) than another popular DNN implementation for the web that makes use of the emulated compute capabilities of WebGL [8].

Aldahir researched the compute performance differences (Mandelbrot set generation and matrix multiplication) of CUDA and WebGPU, with WebGPU set up to run compute operations in a cluster of web browsers. The results showed that CUDA is faster and more efficient than WebGPU. However, the author added that WebGPU is still in early development
and hence not as stable and mature as CUDA. Also, WebGPU, along with WebRTC, displayed good scalability, with over 75% efficiency for building clusters of web browsers [9].

Usher and Pascucci compared the computing capabilities of WebGPU with those of native Vulkan. In the paper, the marching cubes algorithm applied to a scalar field was used as a proxy for compute-intensive tasks. The results display similar performance, with WebGPU falling within the same order of magnitude as, and often even closer to, the Vulkan implementation in terms of time-to-render [10].

Dyken et al. investigated the relative performance of rendering large-scale graph layouts on the web using libraries based on WebGPU (GraphWaGu), WebGL (NetV and Stardust), and non-GPU-accelerated equivalents (such as D3 Canvas). GraphWaGu is the only GPU-accelerated library that is able to compute iterations of the graph algorithms in parallel. As a result, at 100 000 nodes and 2 000 000 edges, only GraphWaGu is able to maintain interactive rendering at a frame rate of ten or more. The equivalent frame rate for NetV is three, with Stardust being unable to render the graph layout at all [11].

A fair amount of research has been done on WebGPU and its general computing capabilities. However, this does not hold true for its rasterization capabilities, in particular for comparisons of WebGPU and WebGL. Furthermore, at the time of this study, no research could be found that is set inside the environment of a game engine. The work presented in this paper aims to reduce the research gap on WebGPU as a new rasterization technology for the web in the environment of the Godot game engine, grounding the research and results in real usability scenarios.

III. METHODS

Godot is an open-source game engine first released in 2015. It has since had many updates, and the newest version, 4.0, had recently been released at the time of this study [12], with many new features and an entirely new rendering pipeline leveraging the aforementioned Vulkan API, along with a host of updates to the existing legacy rendering backends. Godot offers several advantages as a foundation for implementing a rendering backend. Firstly, a pre-established architecture can be followed during implementation, keeping comparisons between rendering APIs fair. Secondly, the currently implemented WebGL rendering backend can be assumed to be fairly well optimized and thus serves as a good benchmark for the performance of WebGL rendering engines in the industry. The reason for choosing Godot over another game engine mainly comes down to its open-source nature.

A. Implementation

In order for the implemented WebGPU Rasterizer and the existing WebGL Rasterizer to be eligible for performance comparisons, the overall computational work they do must be as identical as possible. More precisely, these prerequisites must be aimed for:
1) The shaders used must be as close as possible in terms of instruction count, branching, and operations. Exactly the same work must be done in the shaders.
2) The shader pressure, in terms of data types and data layout, must be as close as possible.
3) No optimizations are allowed for the WebGPU Rasterizer on the CPU side or GPU side that would give it an unfair advantage over the WebGL Rasterizer.
4) The CPU workflow must be as identical as possible in terms of computations and branching.
5) The run-time allocations should be as identical as possible.

To achieve the prerequisites, the work began with deconstructing the WebGL Rasterizer to a state where it would match the targeted minimum viable product (MVP) as closely as possible; the Rasterizer should be able to render simple 2D games of predetermined complexity and nothing more. In order for performance measurements of the two APIs to be as fair as possible, the WebGPU rendering backend has to adhere to the rendering techniques that Godot employs. The techniques that concern the scaled-down version of the Rasterizer backend include batching and instancing as well as the forcing of render target blitting. Batching is a technique used to group similar items and render them together to avoid unnecessary resource binding. For blitting, a separate pipeline was set up with a vertex shader that simply renders a triangle covering the entire back buffer, and a fragment shader that textures this triangle using the main render target texture.

B. Experiment and Data Gathering

When it comes to the performance of games and graphical scenes, the general consensus is that how well something performs is judged by how smoothly it appears to run to the human eye. The data gathered in the conducted experiment is the frame time, measured in milliseconds. As the work of the WebGL backend and the implemented WebGPU backend spans both the CPU and GPU, both CPU and GPU work times are measured. The time gathered is that of a full frame for the CPU and the GPU.

The timings are gathered as averages over 2000 frames. The elapsed time on the CPU for the various scopes was measured using the C++ standard library's chrono header. A timestamp was acquired from chrono::high_resolution_clock at the start of the relevant scope and another one at the end of it. To calculate how much time elapsed, the start timestamp was subtracted from the end one. This elapsed time was then stored in a vector and used later, once enough samples had been gathered, to calculate an average elapsed time. For measuring time on the GPU, different methods need to be used for the different APIs. WebGL provides a way of measuring the elapsed time between two points, whereas WebGPU provides a way to queue a timestamp on the command encoder. If one timestamp is acquired at the start of a frame and one at the end, the elapsed time can be obtained in the same way as described for the CPU measurements.
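The CPU timing procedure described above can be sketched in C++ roughly as follows. This is a minimal illustration, not the authors' code: the class name FrameTimer and its member functions are hypothetical, and only std::chrono::high_resolution_clock, the subtraction of timestamps, the sample vector, and the averaging step come from the text.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the paper): records the elapsed time of one
// measured scope per call pair and averages the stored samples on request.
class FrameTimer {
public:
    using Clock = std::chrono::high_resolution_clock;

    // Acquire the timestamp at the start of the measured scope.
    void begin() { start_ = Clock::now(); }

    // Acquire the end timestamp, subtract the start timestamp from it,
    // and store the elapsed time in milliseconds.
    void end() {
        const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        samples_ms_.push_back(elapsed.count());
    }

    std::size_t sample_count() const { return samples_ms_.size(); }

    // Average elapsed time over all stored samples; needs at least one sample.
    double mean_ms() const {
        assert(!samples_ms_.empty());
        return std::accumulate(samples_ms_.begin(), samples_ms_.end(), 0.0) /
               static_cast<double>(samples_ms_.size());
    }

private:
    Clock::time_point start_{};
    std::vector<double> samples_ms_;
};
```

Under these assumptions, a render loop would call begin() and end() around each frame's CPU work and read mean_ms() once sample_count() reaches 2000, the number of frames the averages are gathered over.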

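Once the per-frame samples are stored in a vector, the summary values reported in Section IV (the mean, the mean of the 1% highest frame times, and the mean of the 95% lowest frame times) can be derived from it. The paper does not state how these subsets were selected, so the sketch below assumes the simplest approach: sort the samples and average the corresponding slices. The names FrameStats, mean_of, and summarize are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type (not from the paper).
struct FrameStats {
    double mean_ms;            // mean over all samples
    double high_1pct_mean_ms;  // mean of the 1% highest frame times
    double low_95pct_mean_ms;  // mean of the 95% lowest frame times
};

// Mean of the half-open range [first, last); the range must not be empty.
inline double mean_of(std::vector<double>::const_iterator first,
                      std::vector<double>::const_iterator last) {
    assert(first != last);
    return std::accumulate(first, last, 0.0) / static_cast<double>(last - first);
}

// Assumption: the subsets are taken from the sorted samples, with at least
// one sample in each subset. The vector is taken by value so the caller's
// sample order is left untouched.
inline FrameStats summarize(std::vector<double> samples_ms) {
    assert(!samples_ms.empty());
    std::sort(samples_ms.begin(), samples_ms.end());  // ascending order

    const std::size_t n = samples_ms.size();
    const auto high_count =
        static_cast<std::ptrdiff_t>(std::max<std::size_t>(1, n / 100));
    const auto low_count =
        static_cast<std::ptrdiff_t>(std::max<std::size_t>(1, n * 95 / 100));

    FrameStats stats{};
    stats.mean_ms = mean_of(samples_ms.cbegin(), samples_ms.cend());
    stats.high_1pct_mean_ms = mean_of(samples_ms.cend() - high_count, samples_ms.cend());
    stats.low_95pct_mean_ms = mean_of(samples_ms.cbegin(), samples_ms.cbegin() + low_count);
    return stats;
}
```

With 2000 samples, this slicing averages the 20 slowest frames for the 1% figure and the 1900 fastest frames for the 95% figure; other percentile conventions would give slightly different values.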
The experiments include two categories: simple 2D games and synthetic tests. For the category of simple 2D games, six different games that are simple in scope and complexity were selected. As the Rasterizers are limited in scope, and as the games must be supported by the Godot version used in this work, the games were selected purely based on the engine's and the two Rasterizers' ability to support and render them. The games are:
1) Snake [13], in which the player must avoid obstacles and gather apples in order for the snake character to grow longer and longer.
2) Evader [14], in which the player must avoid incoming shapes on the highway.
3) Checkers [15] (version v1.0.1-0-g7a4203b), in which the player plays checkers either against an AI or, optionally, against another player locally.
4) Falling Cats [16], in which the player must catch cats falling from a tree before they hit the ground.
5) Deck Before Dawn [17], in which the player strategically plays a number of cards with abilities every turn in order to defend a sleeping child from nightmare creatures.
6) Ponder [18] (version v1.0.0), in which the player must navigate a duck character in a finite number of sequences in order to collect all ducklings.

The synthetic tests are applied in order to test specific areas of rendering and how the Rasterizers compare for each one. As such, the synthetic tests are split into four categories, one for each specific test case. The synthetic test categories are:
1) Multiple Quads (multiple tiny textured sprites): The test consists of one big draw call of one batch consisting of instances of textured sprites, with 10, 100, 1000, 10000, 20000, 30000, 40000, and 50000 sprites rendered on screen simultaneously, using the shaders for rendering the quad render item type.
2) Full-screen Quads (multiple full-screen textured sprites): The test is identical to the aforementioned test, with the sole difference that every sprite is now full-screen sized.
3) Multiple Polygons (multiple tiny polygons): The test consists of one batch per polygon (as Godot has every polygon form its own batch), with 10, 100, 1000, 10000, 20000, 30000, 40000, and 50000 polygons rendered on screen simultaneously, using the shaders for rendering the polygon render item type.
4) Large Polygons (a few polygons, each with 50000 vertices): The test consists of one batch per polygon, with 40, 80, 120, 160, 200, 240, 280, and 320 polygons rendered on screen simultaneously, using the shaders for polygons just like the aforementioned test. The aim of the test is to put considerable pressure on the vertex shader stage, as the number of vertices to process increases significantly with each test step, up to and including 16 million vertices.

C. Hardware and Software Specification

The hardware and the versions of the relevant graphics drivers and software used are presented in Table I.

TABLE I
INFORMATION ABOUT HARDWARE AND SOFTWARE VERSIONS OF THE MACHINE UPON WHICH ALL TEST CASES WERE RUN.

Component               Specification
CPU                     Intel Core i7 12700H, 2.7GHz
GPU                     NVIDIA GeForce RTX 3070 Ti (Laptop Version), 8GB GDDR6
Memory                  SK Hynix, 2x8GB DDR4, 3.2GHz
Disk                    Samsung MZVL21T0HCLR-00B07, 1TB, 7.0/5.1 GB/s
Monitor Resolution      2560x1440
Monitor Refresh Rate    165Hz
Operating System        Windows 11 Home 22H2
NVIDIA Driver Version   531.41
Emscripten              3.1.30
Chrome Canary           114.0.5715.1
Godot Engine Version    4.0

IV. RESULTS

A. Performance Comparison of Game Tests

The GPU frame time is presented first, followed by the total CPU frame time. Along with average frame times, the means of the 1% highest and the 95% lowest frame times are calculated in order to analyze the performance consistency of the two Rasterizers.

Fig. 1. Comparison of the mean WebGL and WebGPU GPU frame times, in milliseconds, for the various games.

Fig. 2. Comparison of the highest 1% mean and the lowest 95% mean WebGL and WebGPU GPU frame times, in milliseconds, for the various games.

1) GPU Frame Time: In Figure 1, it can be seen that WebGPU on average has much shorter GPU frame times than
WebGL in all games that were included in the test. Furthermore, the speed-up of WebGPU over WebGL ranges between 6.822, in the case of Ponder, and 35.611, in the case of Evader. Figure 2 shows that the difference between the lowest 95% of frame times and the highest 1% is larger for WebGL. However, for Checkers, Ponder, and Falling Cats, the relative difference is larger for WebGPU. For Checkers, this comes out to a 7.020 times increase for WebGPU compared to a 4.641 times increase for WebGL. For Ponder, the increase is 4.748 times for WebGPU and 4.292 times for WebGL. Lastly, for Falling Cats, WebGPU shows a 1.613 times increase and WebGL shows a 1.611 times increase. For the other games, WebGPU has a smaller spread in both absolute and percentage terms.

Fig. 3. Comparison of the mean WebGL and WebGPU CPU frame times, in milliseconds, for the various games.

Fig. 4. Comparison of the highest 1% mean and lowest 95% mean WebGL and WebGPU CPU frame times, in milliseconds, for the various games.

2) CPU Frame Time: In Figure 3, it is shown that WebGPU has shorter mean frame times than WebGL for all of the game tests. Deck Before Dawn is a clear outlier in the data set in terms of how much shorter the CPU frame time is with the WebGPU implementation. Figure 4 shows that the percentage differences between the lowest 95% and highest 1% of frame times are typically lower than the spread documented for GPU frame times in Figure 2. This does not, however, hold true for all cases. For instance, Evader shows a larger spread for WebGPU in CPU frame time than it did for the GPU.

B. Performance Comparison of Synthetic Tests

1) GPU Frame Time: For the synthetic test involving rendering multiple quads, WebGPU outperforms WebGL in all cases in mean GPU frame time, as can be clearly seen in Figure 5. The speed-up factor ranges from 4.588, when rendering 40 000 quads, up to 9.039, when rendering ten quads.

Fig. 5. Comparison of the mean WebGL and WebGPU GPU frame times, in milliseconds, for the Multiple Quads test. The workloads range from 10 to 50 000 quads.

The results of rendering multiple full-screen quads show that considerable pressure was put on both Rasterizers, with long mean GPU frame times for all tests above 1000 quads. In Figure 6, both Rasterizers show an approximately linear increase in frame time as the number of quads increases, with WebGPU being roughly 1.8 to 3.1 times faster than WebGL depending on the profiling context.

Fig. 6. Comparison of the mean WebGL and WebGPU GPU frame times, in milliseconds, for the Full-screen Quads test. The workloads range from 10 to 50 000 full-screen quads.

The Polygons synthetic tests show the biggest comparative GPU frame time differences between the two Rasterizers, with WebGPU vastly outperforming WebGL in every case. As an example, in Figure 7, when rendering 50 000 polygons, WebGL manages an average of 150.94 milliseconds per frame while WebGPU is still running at passable real-time speeds (15.75 milliseconds, equivalent to more than 60 frames per second). The GPU frame times for the Large Polygons synthetic test show that WebGPU is roughly 2 to 3 times faster across the various workloads. The frame time increases roughly linearly with greater workloads for both Rasterizers, with a deviation from this trend occurring at 4 million vertices for WebGL. As with the test of rendering multiple quads,
WebGL again shows a bigger spread of frame times, with WebGPU remaining fairly stable (see Figure 8).

Fig. 7. Comparison of the mean WebGL and WebGPU GPU frame times, in milliseconds, for the Multiple Polygons test. The workloads range from 10 to 50 000 polygons.

Fig. 8. Comparison of the mean WebGL and WebGPU GPU frame times, in milliseconds, for the Large Polygons test. The workloads range from 2 million to 16 million vertices.

2) CPU Frame Time: For the Quads synthetic test, Figure 9 shows that WebGPU performs better in total CPU frame time than WebGL. The graph also shows that the gap between them, with the exception of the 30 000 quads variant, increases the more items are rendered.

Fig. 9. Comparison of the mean WebGL and WebGPU CPU frame times, in milliseconds, for the Multiple Quads test. The workloads range from 10 to 50 000 quads.

For the Full-screen Quads synthetic tests, WebGL can be seen to have a mostly linear increase in mean CPU frame time as the number of quads that need to be rendered grows (see Figure 10). However, for WebGPU, while there is a fairly steady increase for all workloads below 50 000 quads, the 50 000 quads variant has a much larger frame time than the 40 000 quads variant. The frame time for this variant looks very similar to the GPU frame time presented in Figure 6.

Fig. 10. Comparison of the mean WebGL and WebGPU CPU frame times, in milliseconds, for the Full-screen Quads test. The workloads range from 10 to 50 000 full-screen quads.

The total CPU frame time of the Polygons synthetic tests shows that, outside of the 10 polygon test case, the WebGPU implementation is faster than the WebGL one (Figure 11). It also shows that an increase in rendered polygons causes a nearly linear increase in frame time, where the increase for the WebGL Rasterizer is steeper than that of the WebGPU one. Notably, for WebGPU, the step from 40 000 polygons to 50 000 polygons breaks the previous near-linearity and causes the frame time to increase more significantly.

Fig. 11. Comparison of the mean WebGL and WebGPU CPU frame times, in milliseconds, for the Multiple Polygons test. The workloads range from 10 to 50 000 polygons.

In the Large Polygons synthetic test, the total CPU time recorded for the two Rasterizers in Figure 12 shows that WebGPU has a very small, steady increase as the workload increases, whereas the WebGL implementation varies, seemingly settling into a steady increase after 12 million vertices (240 polygons).

We used paired t-tests as the main statistical significance tests in our study. The p-values of all comparisons of overall mean GPU and CPU frame times are less than 0.05, which means the differences are statistically significant. In all games, WebGPU outperforms WebGL in terms of both overall CPU and GPU mean frame times. The synthetic tests show similar results, with WebGPU outperforming WebGL in both CPU and
GPU frame time performance. This is especially prominent in the case of rendering multiple polygons and multiple quads. WebGL also shows more fluctuating frame times, as is evident from the multiple quads and the large polygons tests.

Fig. 12. Comparison of the mean WebGL and WebGPU CPU frame times, in milliseconds, for the Large Polygons test. The workloads range from 2 million to 16 million vertices.

V. DISCUSSION

There are several explanations as to why WebGPU performs better than WebGL at the rendering tasks presented in the study. One is the use of modern graphics drivers and bundled state. These provide optimizations that WebGL or OpenGL cannot achieve, and help explain the superior performance of WebGPU. Despite WebGPU showing consistently better frame times than WebGL, there are still times when it struggles. For instance, the CPU frame time for the 50 000 full-screen quads synthetic test increases significantly compared to the 40 000 full-screen quads test due to data uploading to the instance buffer. This function call may force CPU and GPU synchronization, leading to longer CPU frame times and longer GPU frame times. The reason why synchronization is not necessary for the other variants is unknown and requires further study.

Aside from the already discussed performance benefits inherent to WebGPU as a modern technology, there exist other possible explanations as to why the performance of WebGL falls behind WebGPU in the experiments conducted. One of the reasons is related to stalling. WebGL experiences different types of stalling at varying workloads, while WebGPU does not. In the case of larger workloads, WebGL is several hundred milliseconds slower than WebGPU in mean GPU frame times. On the other hand, the Polygons tests show that the GPU is stalled instead as the workload increases. The CPU stalls as it waits for WebGL GPU instructions to complete, which is reflected in the exceptionally high mean CPU frame times in WebGL.

VI. CONCLUSIONS AND FUTURE WORK

This paper has investigated the relative performance of two Rasterizers based on two different rendering APIs: WebGL and WebGPU. This was done by implementing a WebGPU Rasterizer backend and comparing it with the existing WebGL Rasterizer backend in the context of the Godot game engine. The work was grounded in both game examples with realistic workloads and raw stress tests of varying workloads through synthetic experiments. The results presented show that the WebGPU implementation, in its current state, consistently performs better than the WebGL equivalent. It does so across all conducted experiments in terms of total mean CPU and GPU frame time. Furthermore, the presented results are, in general, statistically significant. The WebGPU renderer implementation is relatively naive; better results could likely be achieved with a more modern graphics API workflow. A notable suggestion for future research is to investigate the GPU VRAM usage of both WebGL and WebGPU, if and when this feature eventually becomes available for WebGPU. Another suggestion for future research is to build upon the work in this study to make the WebGPU Rasterizer more feature-rich. This would mainly involve adding support for additional render item types and complementing the 2D Canvas Renderer with the 3D Scene Renderer.

REFERENCES

[1] "Direct3D - Win32 apps," Microsoft, Sep. 2021. [Online]. Available: https://learn.microsoft.com/en-us/windows/win32/direct3d
[2] "Metal Overview," Apple Inc. [Online]. Available: https://developer.apple.com/metal/
[3] Vulkan, "Vulkan Cross platform 3D Graphics," Khronos Group. [Online]. Available: https://www.vulkan.org/
[4] "OpenGL - The Industry Standard for High Performance Graphics," Khronos Group. [Online]. Available: https://www.opengl.org/
[5] A. Evans, M. Romeo, A. Bahrehmand, J. Agenjo, and J. Blat, "3D graphics on the web: A survey," Computers & Graphics, vol. 41, pp. 43–61, 2014.
[6] D. Liu, J. Peng, Y. Wang, M. Huang, Q. He, Y. Yan, B. Ma, C. Yue, and Y. Xie, "Implementation of interactive three-dimensional visualization of air pollutants using WebGL," Environmental Modelling & Software, vol. 114, pp. 188–194, Apr. 2019. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S1364815218304195
[7] "WebGPU," W3C, 2021. [Online]. Available: https://www.w3.org/TR/2021/WD-webgpu-20210518/
[8] M. Hidaka, Y. Kikura, Y. Ushiku, and T. Harada, "WebDNN: Fastest DNN execution framework on web browser," in MM 2017 - Proceedings of the 2017 ACM Multimedia Conference, 2017, pp. 1213–1216.
[9] A. Aldahir, "Evaluation of the performance of webGPU in a cluster of web-browsers for scientific computing," Bachelor's thesis, Umeå University, 2022. [Online]. Available: http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-197058
[10] W. Usher and V. Pascucci, "Interactive Visualization of Terascale Data in the Browser: Fact or Fiction?" in 2020 IEEE 10th Symposium on Large Data Analysis and Visualization (LDAV), 2020, pp. 27–36.
[11] L. Dyken, P. Poudel, W. Usher, S. Petruzza, J. Y. Chen, and S. Kumar, "Graphwagu: Gpu powered large scale graph layout computation and rendering for the web," in Eurographics Symposium on Parallel Graphics and Visualization, 2022. [Online]. Available: https://diglib.eg.org/xmlui/bitstream/handle/10.2312/pgv20221067/073-083.pdf?sequence=1
[12] "Godot 4.0 sets sail: All aboard for new horizons," Godot. [Online]. Available: https://godotengine.org/article/godot-4-0-sets-sail/
[13] P. Hex, "Snake in Godot4," Itch.io. [Online]. Available: https://hexblit.itch.io/snake-in-godot4
[14] MohamedA.G, "Evader," Itch.io. [Online]. Available: https://mohamedag.itch.io/evader
[15] Aezart, "Checkers," Itch.io. [Online]. Available: https://aezart.itch.io/checkers
[16] angelchama333, "Falling Cats," Itch.io. [Online]. Available: https://angelchama333.itch.io/falling-cats
[17] ShoeFisherGames, "Deck Before Dawn," Itch.io. [Online]. Available: https://shoefishergames.itch.io/deck-before-dawn
[18] ceruleancerise, "Ponder," Itch.io. [Online]. Available: https://ceruleancerise.itch.io/ponder
