A Brief Introduction to DLSS (Deep Learning Super Sampling)
These days, although COVID-19 has significantly slowed progress in almost every field, the game industry has still managed to produce many exciting things. The most exciting are probably Unreal Engine 5 and the PlayStation 5 with its amazingly fast SSD. But as a game fan and a rookie in AI technology, I was more attracted by DLSS 2.0, which was released earlier than either of them. In China, DLSS is nicknamed 大力水手 (Da Li Shui Shou), after the cartoon character Popeye the Sailor. I like the nickname, because DLSS does for a video card what spinach does for Popeye. Unfortunately, the technology itself is a proprietary project protected by Nvidia, so I cannot see its internal details. Still, I think it is worth knowing the trends in the area I am interested in, and it might bring some inspiration. So if you are a gamer like me, I hope this article can give you a useful introduction to DLSS.
DLSS was first released by Nvidia together with the RTX 2000 series video cards. It aims to replace traditional anti-aliasing methods by making thorough use of the new processing units on those cards: the Tensor Cores of the Turing architecture. Compared with the traditional methods, DLSS makes it possible for users with a mid-range video card (e.g., an RTX 2060) to enjoy 4K gameplay with realistic ray tracing at a higher frame rate.
Let's look at the performance first. In April 2020, two AAA games added support for DLSS 2.0: Control and Wolfenstein: Youngblood. I tested Control on an RTX 2080 Super at 4K resolution with ray tracing fully enabled and made a comparison. The results are shown in the picture:

As we can see, in this static frame of an indoor scene, the frame rate rises by about 70% while the hardware load stays the same. The difference is even clearer if we look at the portrait on the wall, a distant object in the scene.

In the gameplay fighting scenes, we can see that DLSS consumes even fewer resources when turned on. I am not a professional tester, though, so this might be a special case: there are not many objects in the scene, which could explain the result. In practice, the gameplay is better than I can show you in a static picture. One very user-friendly change is that turning the camera feels noticeably smoother, which would be an important benefit for FPS games. Usually the speed of dynamically loading graphic resources limits the player's field of view (usually to less than 120 degrees, which is considered the normal viewing angle in real life). Also, while the player turns, developers usually apply motion blur so that low-frame-rate motion "seems" smooth to our brain. However, this kind of cheating, together with a limited viewing angle, makes a huge group of players feel dizzy. In my personal opinion, DLSS may be able to save this particular group of players in FPS games. It might also allow FPS esports players to perform better.
The improvement on a mid-range video card is even more significant. At 4K with full ray tracing, the performance jumps from "not even playable (9 FPS)" to "smooth (more than 30 FPS)". Below is a test I found from a game developer:

Now I'm sure you can understand the power of DLSS. So what does it change compared with traditional anti-aliasing? First we have to understand that in gameplay, the real-time frames are generated from prototype models; there would never be enough space to store finished images for every possible condition. The core of traditional anti-aliasing is essentially generating extra samples on top of that rendering (if you are interested in the details, see https://learnopengl.com/Advanced-OpenGL/Anti-Aliasing): in general, a traditional method renders the scene at a higher resolution first, then filters it back down to the display resolution with fixed algorithms.
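To make that concrete, here is a tiny numpy sketch of the supersampling idea; the function name and the 2x factor are my own choices for illustration, not any engine's actual code. Render more samples than the screen needs, then average each block down to one display pixel:

```python
import numpy as np

def supersample_downscale(hires: np.ndarray, factor: int = 2) -> np.ndarray:
    """Average each factor x factor block of a supersampled image
    (H, W, C) down to one display pixel -- the core idea of SSAA."""
    h, w, c = hires.shape
    assert h % factor == 0 and w % factor == 0
    # Group pixels into factor x factor blocks and average over them.
    blocks = hires.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

# Example: a "rendered" 8x8 frame filtered down to a 4x4 display frame.
frame = np.random.rand(8, 8, 3).astype(np.float32)
print(supersample_downscale(frame).shape)  # (4, 4, 3)
```

The smoothing comes entirely from shading many more pixels than are displayed, which is exactly why this approach is so expensive.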
This calculation runs mainly on the GPU's CUDA cores. However, in a game environment the CUDA cores are a critical resource that many other threads compete for, especially in real-time graphics processing. Under this limitation, AAA games have to either require users to own a more powerful and expensive video card or make compromises that lower the graphic quality.
With the new architecture of the RTX 2000 series video cards, the Tensor Cores can run deep learning models. On Nvidia's side, a supercomputer trains the model on original high-resolution graphics gathered from the game developer, and the trained model is then compiled into the video card driver. In this way, when a frame is processed, the anti-aliasing step no longer relies on the CUDA cores but instead takes the outputs of the model. In the best case, it produces higher-quality graphics at a higher speed.
According to Nvidia's official description, DLSS 2.0 has two primary inputs into the AI network:
- Low-resolution, aliased images rendered by the game engine
- Low-resolution motion vectors from the same images, also generated by the game engine
Motion vectors tell which direction objects in the scene are moving from frame to frame. DLSS can apply these vectors to the previous high-resolution output to estimate what the next frame will look like. Nvidia refers to this process as "temporal feedback", as it uses history to inform the future.
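Nvidia has not published how this warping is implemented, but to get a rough feel for "applying motion vectors to the previous output", here is a toy backward-warping sketch in PyTorch. The helper name warp_previous and the sign convention (a vector points back to where the pixel came from) are my own assumptions:

```python
import torch
import torch.nn.functional as F

def warp_previous(prev_frame: torch.Tensor, motion: torch.Tensor) -> torch.Tensor:
    """Backward-warp prev_frame (N, C, H, W) using per-pixel motion
    vectors (N, 2, H, W) given in pixels, x component first."""
    n, _, h, w = prev_frame.shape
    # Base grid: the integer coordinates of every output pixel.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys)).float().unsqueeze(0)   # (1, 2, H, W)
    # Follow each motion vector back to the source position.
    src = base + motion
    # grid_sample expects coordinates normalized to [-1, 1].
    src_x = 2.0 * src[:, 0] / (w - 1) - 1.0
    src_y = 2.0 * src[:, 1] / (h - 1) - 1.0
    grid = torch.stack((src_x, src_y), dim=-1)          # (N, H, W, 2)
    return F.grid_sample(prev_frame, grid, align_corners=True)

# Example: shift a frame one pixel to the right via motion vectors.
prev = torch.rand(1, 3, 4, 4)
motion = torch.zeros(1, 2, 4, 4)
motion[:, 0] = -1.0   # every pixel "came from" one pixel to the left
estimate = warp_previous(prev, motion)
```

The warped previous frame then serves as the history input that the network combines with the new low-resolution render.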

A special type of AI network, a convolutional autoencoder, takes the low-resolution current frame and the high-resolution previous frame, and determines on a pixel-by-pixel basis how to generate a higher-quality current frame.
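The real DLSS network is proprietary, so the following PyTorch sketch is only my guess at the general shape of such a convolutional autoencoder. The channel counts, the depth, the 2x upscale factor, and the assumption that the previous output is resampled to the low-resolution grid before being stacked are all illustrative choices:

```python
import torch
import torch.nn as nn

class ToyUpscaler(nn.Module):
    """Minimal autoencoder sketch: encode the low-res frame stacked
    with the warped previous output, decode to a 2x-larger frame."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),   # 3ch frame + 3ch history
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, low_res, warped_prev):
        x = torch.cat((low_res, warped_prev), dim=1)     # stack along channels
        return self.decoder(self.encoder(x))

# Shapes shrunk for the demo: a 64x64 render becomes a 128x128 frame.
low = torch.rand(1, 3, 64, 64)
prev = torch.rand(1, 3, 64, 64)
print(ToyUpscaler()(low, prev).shape)  # torch.Size([1, 3, 128, 128])
```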
During the training process, the output image is compared to an offline rendered, ultra-high quality 16K reference image, and the difference is communicated back into the network so that it can continue to learn and improve its results. This process is repeated tens of thousands of times on the supercomputer until the network reliably outputs high quality, high resolution images.
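That training recipe is ordinary supervised learning. A minimal sketch, assuming the toy model above, random stand-in data, an L1 loss, and the Adam optimizer (none of which Nvidia has confirmed), might look like this:

```python
import torch

model = ToyUpscaler()                       # the autoencoder sketch above
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.L1Loss()

for step in range(10):                      # stand-in for "tens of thousands"
    low = torch.rand(4, 3, 64, 64)          # fake aliased low-res frames
    prev = torch.rand(4, 3, 64, 64)         # fake warped previous outputs
    reference = torch.rand(4, 3, 128, 128)  # stands in for the 16K ground truth
    pred = model(low, prev)
    loss = loss_fn(pred, reference)         # compare output to the reference
    opt.zero_grad()
    loss.backward()                         # the difference flows back
    opt.step()                              # the network improves its results
```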
Once the network is trained, NGX delivers the AI model to your GeForce RTX PC or laptop via Game Ready Drivers and OTA updates. With Turing’s Tensor Cores delivering up to 110 teraflops of dedicated AI horsepower, the DLSS network can be run in real-time simultaneously with an intensive 3D game. This simply wasn’t possible before Turing and Tensor Cores.
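A quick back-of-the-envelope calculation suggests why that much dedicated horsepower matters. Here I assume a 60 FPS target, which is my own choice, not Nvidia's figure:

```python
tensor_core_flops = 110e12   # 110 teraflops, from Nvidia's quote above
fps = 60                     # assumed target frame rate
per_frame = tensor_core_flops / fps
print(f"~{per_frame:.1e} FLOPs available to the network per frame")
# prints ~1.8e+12, roughly 1.8 trillion operations per frame
```

Even at that frame rate, the Tensor Cores leave trillions of operations per frame for the network, without stealing any of the CUDA cores' shading budget.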
Even though the technology is still as young as a baby, a survival crisis has already arrived. First, the technology has its own limitation: so far the output is not satisfying unless one model is trained for each game. This would impose an unacceptable cost if DLSS were pushed to the whole game industry, and Nvidia would need to cooperate with game developers at a very early stage. Another threat comes from outside. Epic has just unveiled Unreal Engine 5, which also promises to raise graphic quality, and the demo is really impressive. My understanding is that Epic's ambition is to escape the control of Nvidia, the dominant GPU hardware vendor; that is why Epic received 250 million USD from Sony and emphasized how Sony's fast SSD makes the gameplay better. Anyhow, I am looking forward to seeing the final version of DLSS and how it can change the game industry.
In this article, I shared my personal exploration of DLSS. Although the technology is not fully open-sourced, many senior AI engineers and scientists have published their guesses about how it works, and I am learning from those as well. My thought is that even if I cannot directly use this technology in gameplay, similar techniques can still be used in image and video processing to build better datasets or analysis pipelines. If I learn more about this and come up with some results, I will share them here too.