During a behind-closed-doors presentation for the press, NVIDIA shared the first concrete performance figures for the GeForce RTX 4090 GPU, due to launch on October 12th. Whereas the public presentation only used vague reference points, we got actual FPS numbers in this slide, based on 4K resolution tests with DLSS Super Resolution set to Performance Mode. The test system used an Intel Core i9-12900K CPU with 32GB of RAM, running Windows 11 64-bit.
All the tests listed below were run in games with DLSS 3 support, highlighting the importance of this new technology to achieve maximum performance. NVIDIA then divided the results into two brackets: today’s games and tomorrow’s games.
- Cyberpunk 2077 (Max RT – Overdrive Mode) – 90 FPS
- NVIDIA Racer RTX (Full RT) – 80 FPS
- Justice (Full RT) – 81 FPS
- Portal RTX (Full RT) – 117 FPS
As you can see, the performance gains registered by the GeForce RTX 4090 are more significant in games that support more advanced ray tracing implementations. For instance, games without RT enabled, like Microsoft Flight Simulator and Warhammer 40,000: Darktide, only get 2X performance boosts, but the UE5 demo and F1 22 aren’t far from 3X increases, while the Unity demo and Cyberpunk 2077 are close to 4X increases.
CD Projekt RED’s game will soon be updated with the Overdrive Mode, which adds more complex RT calculations. In Overdrive Mode, the GeForce RTX 4090 runs 4X faster with DLSS 3 enabled, while Racer RTX is around 4.5X, Justice is almost at 5X, and Portal RTX isn’t far from a 6X performance boost.
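As a rough sanity check, the FPS figures and multipliers quoted above can be combined to estimate what the comparison hardware would have scored. This is only a back-of-the-envelope sketch: the multipliers are approximate ("around", "almost", "isn't far from"), so the real baselines are likely a little higher, and the `implied_baseline` helper is our own illustration rather than anything NVIDIA provided.

```python
# RTX 4090 FPS from the slide, paired with the approximate speedups quoted in the text.
# The speedups are rounded upper bounds, so treat the results as rough estimates.
results = {
    "Cyberpunk 2077 (Overdrive)": (90, 4.0),
    "NVIDIA Racer RTX": (80, 4.5),
    "Justice": (81, 5.0),
    "Portal RTX": (117, 6.0),
}


def implied_baseline(fps, speedup):
    """Estimate the baseline frame rate implied by an FPS figure and a speedup factor."""
    return fps / speedup


for game, (fps, speedup) in results.items():
    print(f"{game}: ~{implied_baseline(fps, speedup):.1f} FPS baseline")
```

In other words, every one of these titles would sit somewhere around 16 to 23 FPS on the baseline hardware under the same settings, which is why NVIDIA files them under tomorrow's games.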
NVIDIA also discussed several hardware additions made specifically to optimize ray tracing on the new Ada Lovelace architecture. The reason is that ray tracing workloads have become much more taxing since the technology debuted in Battlefield V, which only required 39 RT operations per pixel. The upcoming Cyberpunk 2077 Overdrive Mode, for example, requires 635 RT operations per pixel.
The first and foremost addition is called Shader Execution Reordering (SER). As explained by NVIDIA’s Senior VP of GPU Engineering Jonah Alben, when rays in a ray traced game hit different areas of the scene, they can’t run the same program, so some of them have to sit idle while the first ray finishes.
SER helps by adding a new stage to the ray tracing pipeline where rays that need to run the same program are grouped together, which improves efficiency.
According to NVIDIA, SER enables significant performance boosts in Cyberpunk 2077: Overdrive Mode (+44%), Portal RTX (+29%), and Racer RTX (+20%).
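To get an intuition for why grouping helps, here's a toy model in Python. It is not how the hardware works internally: the 32-wide group size, the four hypothetical hit shaders, and the plain sort used as a stand-in for SER are all assumptions made for illustration. The model simply counts how many times each group of rays has to run a shader when the rays in it disagree about which one they need.

```python
import random

GROUP_SIZE = 32  # rays executed in lockstep; an assumed value for this toy model


def serialized_passes(shader_ids, group_size=GROUP_SIZE):
    """Count shader passes: each group runs once per distinct shader its rays need."""
    passes = 0
    for start in range(0, len(shader_ids), group_size):
        group = shader_ids[start:start + group_size]
        passes += len(set(group))
    return passes


def reorder_by_shader(shader_ids):
    """Toy stand-in for SER: put rays that need the same shader next to each other."""
    return sorted(shader_ids)


rng = random.Random(0)
rays = [rng.randrange(4) for _ in range(256)]  # 256 rays, 4 hypothetical hit shaders

before = serialized_passes(rays)
after = serialized_passes(reorder_by_shader(rays))
print(f"passes without reordering: {before}, with reordering: {after}")
```

With the rays shuffled, nearly every group ends up running several shaders back to back. After reordering, most groups run just one. The real gains depend on the game, which fits with the varying uplifts NVIDIA quotes.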
Another innovation introduced in the GeForce RTX 4090 and the Ada Lovelace architecture as a whole is Displaced Micro-Meshes (DMM). Once again, it’s targeted at ray tracing optimization for geometry. The new third-generation RT core is capable of understanding and processing an optimized BVH (bounding volume hierarchy). As such, BVH build performance is improved and storage requirements are decreased.
Displaced Micro-Meshes will be supported by both Simplygon and Adobe tools.
Lastly, Opacity Micro-Maps (OMMs) make it easier for RT cores to understand how irregular objects should be affected by rays. That’s achieved through opacity masks that include predetermined opacity states: opaque, transparent, or unknown. When the state is already known, OMMs can save a trip back to the Streaming Multiprocessors (SMs) and therefore improve performance, in the case below, by 10%.
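The saving can be sketched in a few lines of Python. This is a simplified illustration, not NVIDIA's implementation: the `Opacity` enum, the `trace_hits` function, and the `any_hit_shader` callback are hypothetical names, and the callback stands in for the work that would otherwise be sent back to the SMs.

```python
from enum import Enum


class Opacity(Enum):
    OPAQUE = "opaque"            # the ray definitely hits
    TRANSPARENT = "transparent"  # the ray passes straight through
    UNKNOWN = "unknown"          # must be resolved by running a shader


def trace_hits(states, any_hit_shader):
    """Return (accepted_hits, shader_calls) for a list of per-region opacity states."""
    accepted = 0
    shader_calls = 0
    for index, state in enumerate(states):
        if state is Opacity.OPAQUE:
            accepted += 1          # resolved from the mask alone
        elif state is Opacity.TRANSPARENT:
            continue               # resolved from the mask alone
        else:
            shader_calls += 1      # the costly trip the mask could not avoid
            if any_hit_shader(index):
                accepted += 1
    return accepted, shader_calls


# Hypothetical mask for a leaf-shaped cutout: mostly known, one ambiguous edge region.
states = [Opacity.OPAQUE] * 5 + [Opacity.TRANSPARENT] * 4 + [Opacity.UNKNOWN]
accepted, shader_calls = trace_hits(states, lambda index: True)
print(accepted, shader_calls)  # 6 1
```

Without the mask, all ten regions in this example would need a shader call. With it, only the single unknown one does.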
We’ve sent a few additional questions to NVIDIA about DLSS 3, SER, DMM, and OMM. We’ll report back once we hear from them.