{{short description|Image upscaling technology by Nvidia}}
{{Puffery|date=March 2024}}
'''Deep learning super sampling''' ('''DLSS''') is a family of [[Real-time computing|real-time]] [[deep learning]] image enhancement and [[image scaling|upscaling]] technologies developed by [[Nvidia]] that are exclusive to its [[Nvidia RTX|RTX]] line of [[graphics card]]s,<ref>{{Cite web|title=NVIDIA DLSS Technology for Incredible Performance|url=https://www.nvidia.com/en-gb/geforce/technologies/dlss/|access-date=2022-02-07|website=NVIDIA|language=en-gb}}</ref> and available in a number of [[Video game|video games]]. The goal of these technologies is to allow the majority of the [[graphics pipeline]] to run at a lower [[Display resolution|resolution]] for increased performance, and then infer a higher-resolution image from this that approximates the same level of detail as if the image had been rendered at this higher resolution. This allows for higher graphical settings and/or [[frame rates]] for a given output resolution, depending on user preference.<ref name=":2">{{cite web|url=https://www.digitaltrends.com/computing/everything-you-need-to-know-about-nvidias-rtx-dlss-technology/|title=Nvidia RTX DLSS: Everything you need to know|publisher=[[Digital Trends]]|date=2020-02-14|access-date=2020-04-05|quote=Deep learning super sampling uses artificial intelligence and machine learning to produce an image that looks like a higher-resolution image, without the rendering overhead. Nvidia's algorithm learns from tens of thousands of rendered sequences of images that were created using a supercomputer. That trains the algorithm to be able to produce similarly beautiful images, but without requiring the graphics card to work as hard to do it.}}</ref>
As of September 2022, the first and second generations of DLSS are available on all RTX-branded cards in supported titles, while the third generation's frame generation is exclusive to [[GeForce 40 series|RTX 40 series]] cards.
== History ==
Nvidia advertised DLSS as a key feature of the [[GeForce 20 series]] cards when they launched in September 2018.<ref name="techspot">{{cite web|url=https://www.techspot.com/article/1992-nvidia-dlss-2020/|title=Nvidia DLSS in 2020: stunning results|publisher=techspot.com|date=2020-02-26|access-date=2020-04-05}}</ref> At that time, the results were limited to a few video games, namely ''[[Battlefield V]]'' and ''[[Metro Exodus]]'', because the algorithm had to be trained specifically for each game in which it was implemented.
In April 2020, Nvidia advertised and shipped an improved version of DLSS named DLSS 2.0 with [[Device driver|driver]] version 445.75. DLSS 2.0 was available for a few existing games including ''Control'' and ''[[Wolfenstein: Youngblood]]'', and would later be added to many newly released games and [[game engine]]s such as [[Unreal Engine]] and [[Unity (game engine)|Unity]].<ref>{{Cite web|date=2021-02-11|title=NVIDIA DLSS Plugin and Reflex Now Available for Unreal Engine|url=https://developer.nvidia.com/blog/nvidia-dlss-and-reflex-now-available-for-unreal-engine-4-26/|access-date=2022-02-07|website=NVIDIA Developer Blog|language=en-US}}</ref>
=== Release history ===
{| class="wikitable"
!Version
!Release date
!Notes
|-
|1.0||February 2019||Predominantly spatial image upscaler; required training specifically for each game integration; included in ''[[Battlefield V]]'' and ''[[Metro Exodus]]'', among others<ref name="battlefieldv"/>
|-
|"1.9" (unofficial name)||August 2019||DLSS 1.0 adapted for running on the CUDA shader cores instead of tensor cores, used for ''[[Control (video game)|Control]]''<ref name="eurogamer"/><ref name="techspot"/><ref name="nividiacontrol">{{cite web |last1=Edelsten |first1=Andrew |title=NVIDIA DLSS: Control and Beyond |url=https://www.nvidia.com/en-us/geforce/news/dlss-control-and-beyond/ |publisher=
|-
|2.0||April 2020||An AI-accelerated form of [[Temporal anti-aliasing|TAAU]] using Tensor Cores, and trained generically rather than per game<ref name="control2">{{cite web|url=https://www.techquila.co.in/nvidia-dlss-2-control-review/|title=NVIDIA DLSS 2.0 Review with Control – Is This Magic?|publisher=techquila.co.in|date=2020-04-05|access-date=2020-04-06}}</ref>
|-
|3.0
|September 2022
|DLSS 2.0, augmented with an optical flow frame generation algorithm that generates one additional frame for every rendered frame; exclusive to RTX 40 series GPUs
|-
|3.5
|September 2023
|Adds ray reconstruction, replacing multiple denoising algorithms with a single AI model
|}
== Quality presets ==
When using DLSS, depending on the game, users have access to various quality presets in addition to the option to set the internally rendered, upscaled resolution manually:
{| class="wikitable sortable"
|+'''Standard DLSS presets'''
!Quality preset
!Scale factor
!Render scale (per axis)
|-
|DLAA
|1.00x
|100%
|-
|Ultra Quality<ref name=":5">{{cite web |title=NVIDIA preparing Ultra Quality mode for DLSS, 2.2.9.0 version spotted |url=https://videocardz.com/newz/nvidia-preparing-ultra-quality-mode-for-dlss-2-2-9-0-version-spotted |access-date=2021-07-06 |website=VideoCardz.com |language=en-US}}</ref><sub> (unused)</sub>
|1.32x
|76.0%
|-
|Quality
|1.50x
|66.7%
|-
|Balanced
|1.72x
|58.0%
|-
|Performance
|2.00x
|50.0%
|-
|Ultra Performance<sub> (since v2.1; recommended only for resolutions of [[8K resolution|8K]] and above)</sub><ref name=":5" />
|3.00x
|33.3%
|-
|Auto
| colspan="2" |Rendered resolution dynamically adjusts in real time to achieve user-defined
|}
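The scale factors in the table apply per axis, so the rendered pixel count falls with the square of the factor. The following sketch (illustrative Python, not Nvidia code; the function and preset dictionary are hypothetical) shows the arithmetic:

<syntaxhighlight lang="python">
# Hypothetical sketch: map an output resolution to the internally rendered
# resolution using the per-axis scale factors from the table above.
PRESETS = {
    "Quality": 1.50,
    "Balanced": 1.72,
    "Performance": 2.00,
    "Ultra Performance": 3.00,
}

def render_resolution(out_w: int, out_h: int, preset: str) -> tuple[int, int]:
    """Divide each output axis by the preset's scale factor."""
    s = PRESETS[preset]
    return round(out_w / s), round(out_h / s)

# 4K output in Performance mode is rendered internally at 1920x1080,
# i.e. one quarter of the output pixel count (50.0% per axis).
print(render_resolution(3840, 2160, "Performance"))  # (1920, 1080)
</syntaxhighlight>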
== Implementations ==

=== DLSS 1.0 ===
The first iteration of DLSS is a predominantly spatial image upscaler with two stages, both relying on [[Convolutional neural network|convolutional]] [[Autoencoder|auto-encoder]] [[neural network]]s.<ref>{{Cite web|date=2018-09-19|title=DLSS: What Does It Mean for Game Developers?|url=https://developer.nvidia.com/blog/dlss-what-does-it-mean-for-game-developers/|access-date=2022-02-07|website=NVIDIA Developer Blog|language=en-US}}</ref> The first stage is an image enhancement network which uses the current frame and motion vectors to perform [[edge enhancement]] and [[spatial anti-aliasing]]. The second stage is an image upscaling step which uses the single raw, low-resolution frame to upscale the image to the desired output resolution. Using just a single frame for upscaling means the neural network itself must generate a large amount of new information to produce the high-resolution output; this can result in slight [[Hallucination (artificial intelligence)|hallucinations]], such as leaves that differ in style from the source content.<ref name="NVIDIA" />
The neural networks are trained on a per-game basis by generating a "perfect frame" using traditional [[supersampling]] to 64 samples per pixel, as well as the motion vectors for each frame. The data collected must be as comprehensive as possible, including as many levels, times of day, graphical settings, resolutions, etc. as possible. This data is also [[Data augmentation|augmented]] using common augmentations such as rotations, colour changes, and random noise to help the network generalize. Training is performed on Nvidia's Saturn V supercomputer.<ref name=":1" /><ref name="nvidia10">{{cite web|url=https://www.nvidia.com/en-us/geforce/news/nvidia-dlss-your-questions-answered/|title=NVIDIA DLSS: Your Questions, Answered|publisher=[[Nvidia]]|date=2019-02-15|access-date=2020-04-19}}</ref>
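A minimal sketch of the kinds of augmentations described (illustrative Python with NumPy; the function is hypothetical, not Nvidia's training code):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def augment(frame: np.ndarray) -> np.ndarray:
    """Apply a random rotation, colour shift, and noise to an H x W x 3
    float image in [0, 1] -- the augmentation types named above."""
    frame = np.rot90(frame, k=rng.integers(0, 4), axes=(0, 1))   # rotation
    frame = frame * rng.uniform(0.8, 1.2, size=3)                # colour change
    frame = frame + rng.normal(0.0, 0.02, size=frame.shape)      # random noise
    return np.clip(frame, 0.0, 1.0)
</syntaxhighlight>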
This first iteration received a mixed response, with many criticizing the often soft appearance and artifacts in certain situations;<ref name="nvidia20">{{cite web|date=2020-03-23|title=NVIDIA DLSS 2.0: A Big Leap In AI Rendering|url=https://www.nvidia.com/en-us/geforce/news/nvidia-dlss-2-0-a-big-leap-in-ai-rendering/|access-date=2020-04-07|publisher=[[Nvidia]]}}</ref><ref name=":0" /><ref name="battlefieldv" /> this was likely a side effect of the limited data from using only a single frame as input to the neural networks, which could not be trained to perform optimally in all scenarios and [[Edge case|edge cases]].<ref name="NVIDIA" /><ref name=":1" /> Nvidia also demonstrated that the auto-encoder networks could learn to recreate [[Depth of field|depth-of-field]] and [[motion blur]],<ref name=":1" /> although this functionality has never been included in a publicly released product.{{Citation needed|date=February 2022}}
=== DLSS 2.0 ===
DLSS 2.0 is a [[temporal anti-aliasing]] [[upsampling]] (TAAU) implementation, using data from previous frames extensively through sub-pixel jittering to resolve fine detail and reduce aliasing. The data DLSS 2.0 collects includes the raw low-resolution input, [[motion vector]]s, [[Z-buffering|depth buffers]], and [[Exposure value|exposure]]/brightness information.<ref name="NVIDIA" /> It can also be used as a simpler TAA implementation where the image is rendered at 100% resolution rather than being upsampled by DLSS; Nvidia brands this as [[DLAA]] (deep learning anti-aliasing).<ref name=":4">{{Cite web|date=2021-09-28|title=What is Nvidia DLAA? An Anti-Aliasing Explainer|url=https://www.digitaltrends.com/computing/what-is-nvidia-dlaa/|access-date=2022-02-10|website=Digital Trends|language=en}}</ref>
TAA(U) is used in many modern video games and [[game engine]]s; however, previous implementations have relied on manually written [[heuristic]]s to prevent temporal artifacts, such as neighbourhood clamping, which prevents samples collected in previous frames from deviating too far from nearby pixels in the current frame. This suppresses ghosting, but also discards fine detail recovered from frame history.
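A minimal sketch of such a hand-written heuristic (illustrative Python with NumPy; this is the conventional TAA clamping approach, not DLSS, which replaces it with a neural network):

<syntaxhighlight lang="python">
import numpy as np

def taa_resolve(current: np.ndarray, history: np.ndarray,
                alpha: float = 0.1) -> np.ndarray:
    """Blend the reprojected history with the current frame, clamping the
    history colour to the current 3x3 neighbourhood to suppress ghosting.
    current, history: H x W x 3 float arrays."""
    h, w, _ = current.shape
    out = np.empty_like(current)
    for y in range(h):
        for x in range(w):
            nb = current[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2]
            lo, hi = nb.min(axis=(0, 1)), nb.max(axis=(0, 1))
            clamped = np.clip(history[y, x], lo, hi)  # the heuristic
            out[y, x] = alpha * current[y, x] + (1 - alpha) * clamped
    return out
</syntaxhighlight>

Clamping in this way removes most ghosting, but it also throws away detail accumulated across frames, which is the loss of fine detail described above.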
DLSS 2.0 uses a [[Convolutional neural network|convolutional]] [[Autoencoder|auto-encoder]] [[neural network]]<ref name="nvidia20" /> trained to identify and fix temporal artifacts, instead of manually programmed heuristics as mentioned above. Because of this, DLSS 2.0 can generally resolve detail better than other TAA and TAAU implementations, while also removing most temporal artifacts. This is why DLSS 2.0 can sometimes produce a sharper image than rendering at higher, or even native resolutions using traditional TAA. However, no temporal solution is perfect, and artifacts (ghosting in particular) are still visible in some scenarios when using DLSS 2.0.
Because temporal artifacts occur in most art styles and environments in broadly the same way, the neural network that powers DLSS 2.0 does not need to be retrained when used in different games. Despite this, Nvidia frequently ships new minor revisions of DLSS 2.0 with new titles,<ref>{{Cite web|title=NVIDIA DLSS DLL (2.3.7) Download|url=https://www.techpowerup.com/download/nvidia-dlss-dll/|access-date=2022-02-10|website=TechPowerUp|language=en}}</ref> which could suggest that some minor training optimizations are performed as games are released, although Nvidia does not provide changelogs for these minor revisions to confirm this. The main advancements compared to DLSS 1.0 include significantly improved detail retention, a generalized neural network that does not need to be re-trained per game, and roughly half the overhead (~1–2 ms vs. ~2–4 ms).<ref name="NVIDIA" />
Forms of TAAU such as DLSS 2.0 are not [[Video scaler|upscalers]] in the same sense as techniques such as ESRGAN or DLSS 1.0, which attempt to create new information from a low-resolution source; instead, TAAU works to recover data from previous frames rather than creating new data. In practice, this means low-resolution [[Texture mapping|textures]] in games will still appear low-resolution when using current TAAU techniques. For this reason, Nvidia recommends game developers use higher-resolution textures than they normally would for a given rendering resolution, by applying a mip-map bias when DLSS 2.0 is enabled.<ref name="NVIDIA" />
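The bias can be expressed as a negative texture LOD offset derived from the ratio of render to output resolution; a minimal sketch of that arithmetic (illustrative Python; exact formulas and offsets are implementation details of each integration):

<syntaxhighlight lang="python">
import math

def mip_lod_bias(render_width: int, output_width: int) -> float:
    """Negative LOD bias so textures are sampled at the sharpness of the
    output resolution rather than the lower render resolution."""
    return math.log2(render_width / output_width)

# Performance mode at 4K renders 1920 wide: a bias of -1.0, i.e. the
# sampler picks mip levels one step more detailed than usual.
print(mip_lod_bias(1920, 3840))  # -1.0
</syntaxhighlight>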
=== DLSS 3.0 ===
DLSS 3.0 augments DLSS 2.0 with [[motion interpolation]]. The DLSS frame generation algorithm takes two rendered frames from the rendering pipeline and generates a new frame that smoothly transitions between them, so for every frame rendered, one additional frame is generated.<ref name=":3" /> DLSS 3.0 makes use of a new-generation Optical Flow Accelerator (OFA) included in Ada Lovelace generation RTX GPUs, which is faster and more accurate than the OFA available in previous Turing and Ampere RTX GPUs.<ref>{{Cite web |date=2018-11-29 |title=NVIDIA Optical Flow SDK |url=https://developer.nvidia.com/opticalflow-sdk |access-date=2022-09-20 |website=NVIDIA Developer |language=en}}</ref> This makes DLSS 3.0 exclusive to the RTX 40 series. At release, DLSS 3.0 does not work for VR displays.{{cn|date=May 2023}}
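A minimal sketch of motion-interpolated frame generation in this spirit (illustrative Python with NumPy; the real algorithm is proprietary, also consumes game-engine motion vectors, and handles occlusions, none of which this toy backward warp does):

<syntaxhighlight lang="python">
import numpy as np

def midpoint_frame(frame0: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Synthesize an intermediate frame by warping frame0 halfway along the
    optical-flow field toward the next frame.
    frame0: H x W x 3 image; flow: H x W x 2 per-pixel offsets (dx, dy)."""
    h, w, _ = frame0.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # For each output pixel, look up where that content was in frame0,
    # i.e. half a flow step backwards (nearest-neighbour sampling).
    src_y = np.clip(np.rint(ys - 0.5 * flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs - 0.5 * flow[..., 0]).astype(int), 0, w - 1)
    return frame0[src_y, src_x]
</syntaxhighlight>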
=== DLSS 3.5 ===
DLSS 3.5 adds ray reconstruction, replacing multiple denoising algorithms with a single AI model trained on five times more data than DLSS 3. Ray reconstruction is available on all RTX GPUs.
== Anti-aliasing ==
DLSS requires and applies its own [[anti-aliasing]] method. Thus, depending on the game and quality setting used, using DLSS may improve image quality even over native resolution rendering.<ref>{{Cite web |last=Smith |first=Matthew S. |date=2023-12-28 |title=What Is DLSS and Why Does it Matter for Gaming? |url=https://www.ign.com/articles/what-is-nvidia-dlss-meaning |access-date=2024-06-13 |website=IGN |language=en}}</ref> It operates on similar principles to [[Temporal anti-aliasing|TAA]]. Like TAA, it uses information from past frames to produce the current frame. Unlike TAA, DLSS does not sample every pixel in every frame. Instead, it samples different pixels in different frames and uses pixels sampled in past frames to fill in the unsampled pixels in the current frame. DLSS uses machine learning to combine samples in the current frame and past frames, and it can be thought of as an advanced and superior TAA implementation made possible by the available tensor cores.<ref name="NVIDIA" /> [[Nvidia]] also offers [[deep learning anti-aliasing]] (DLAA). DLAA provides the same AI-driven anti-aliasing DLSS uses, but without any upscaling or downscaling functionality.<ref name=":4" />
== Architecture ==
With the exception of the shader-core version implemented in ''Control'', DLSS is only available on [[GeForce 20 series|GeForce RTX 20]], [[GeForce 30 series|GeForce RTX 30]], [[GeForce 40 series|GeForce RTX 40]], and [[Quadro#Quadro RTX|Quadro RTX]] series of video cards, using dedicated [[AI accelerator]]s called '''Tensor Cores'''.<ref name="nvidia20"/>{{Failed verification|date=March 2024}} Tensor Cores are available since the Nvidia [[Volta (microarchitecture)|Volta]] [[graphics processing unit|GPU]] [[microarchitecture]], which was first used on the [[Nvidia Tesla|Tesla V100]] line of products.<ref>
{{cite web|url=https://www.tomshardware.com/news/nvidia-tensor-core-tesla-v100,34384.html|title=On Tensors, Tensorflow, And Nvidia's Latest 'Tensor Cores'|publisher=tomshardware.com|date=2017-04-11|access-date=2020-04-08}}</ref> They are used for performing [[Multiply–accumulate operation|fused multiply–add]] (FMA) operations, which are used extensively in neural network calculations to apply a large series of multiplications on weights, followed by the addition of a bias. Tensor cores can operate on FP16, INT8, INT4, and INT1 data types. Each core can process 1024 bits of FMA operations per clock: 1024 INT1, 256 INT4, 128 INT8, or 64 FP16 operations per clock per tensor core. Most Turing GPUs have a few hundred tensor cores.
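The per-clock figures above follow directly from dividing the 1024-bit budget by each data type's element width; a quick check in Python:

<syntaxhighlight lang="python">
# Throughput figures from the paragraph above: a tensor core's 1024 bits of
# FMA input per clock divided by the element width of each data type.
CORE_BITS_PER_CLOCK = 1024

for dtype, bits in [("INT1", 1), ("INT4", 4), ("INT8", 8), ("FP16", 16)]:
    print(f"{dtype}: {CORE_BITS_PER_CLOCK // bits} FMA ops per clock per core")
# INT1: 1024, INT4: 256, INT8: 128, FP16: 64
</syntaxhighlight>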
== Issues and criticism ==
Especially in early versions of DLSS, users reported blurry frames. In 2019, Andrew Edelsten, an Nvidia employee, commented on the problem in a blog post, promising that the technology was being improved, and clarified that the DLSS AI algorithm had mainly been trained with 4K image material. The use of DLSS produces particularly blurry images at lower resolutions such as [[Full HD]] because the algorithm has far less image information available to construct an appropriate image than at higher resolutions like 4K.<ref>{{Cite web |title=NVIDIA DLSS: Your Questions, Answered |url=https://www.nvidia.com/en-us/geforce/news/nvidia-dlss-your-questions-answered/ |access-date=2024-07-09 |publisher=Nvidia |language=en-us}}</ref>
The use of DLSS frame generation can increase [[input latency]]<ref>{{Cite web |date=2023-11-21 |title=When a high frame rate can lose you the game |url=https://www.digitaltrends.com/computing/when-frames-dont-win-games/ |access-date=2024-07-09 |website=Digital Trends |language=en}}</ref> and introduce [[visual artifacts]].<ref>{{Cite web |date=2023-03-08 |title=Nvidia DLSS 3 Revisit: We Try It Out in 9 Games |url=https://www.techspot.com/article/2639-dlss-3-revisit/ |access-date=2024-07-09 |website=TechSpot |language=en-US}}</ref> Critics have also argued that by implementing DLSS, game developers lose the incentive to optimize their games to run smoothly at native resolution on modern PC hardware. For example, for ''[[Alan Wake 2]]'' in [[4K resolution]] at the highest graphics settings with [[Ray tracing (graphics)|ray tracing]] enabled, the use of DLSS in Performance mode is recommended to achieve 60 fps even on current-generation high-end graphics cards such as the [[Nvidia GeForce RTX 4080]].<ref>{{Cite web |date=2023-10-26 |title=Alan Wake 2 on PC is an embarrassment of riches |url=https://www.digitaltrends.com/computing/alan-wake-2-pc-performance/ |access-date=2024-07-09 |website=Digital Trends |language=en}}</ref>
== See also ==
* [[GPUOpen#FidelityFX Super Resolution|FidelityFX Super Resolution]] – competing technology from [[AMD]]
* [[Intel XeSS]] – competing technology from [[Intel]]
* [[PlayStation Spectral Super Resolution]] – similar technology from [[Sony]]
== References ==
{{Reflist}}
{{NVIDIA}}
[[Category:3D computer graphics]]
[[Category:Nvidia]]
[[Category:Anti-aliasing algorithms]]