This post has been republished via RSS; it originally appeared at: Microsoft Developer Blogs.

Gears Tactics is the very first game to ship with support for DirectX 12 Variable Rate Shading (VRS), one of the major features in DirectX 12 Ultimate. VRS let Gears Tactics achieve large performance gains, up to 18.9% (!), on a wide range of hardware with a minimal impact on visual quality. Check out this guest post where Jacob Nelson and Cam McRae share implementation details, performance data and future ways to expand VRS.

Iterating on Variable Rate Shading in Gears Tactics

Jacob Nelson, Programmer at Disbelief
Cam McRae, Technical Director at The Coalition

Gears Tactics is a fast-paced, turn-based strategy game set in the universe of one of gaming’s most-acclaimed franchises, Gears of War. One of The Coalition’s primary goals on Gears Tactics was to reach a wider audience on PC by reducing the hardware barrier to entry. This led us to look for new ways to improve performance that didn’t rely entirely on sacrificing visual quality. We landed on Variable Rate Shading (VRS), which gave a generous performance boost with minimal quality loss. In this article we’ll detail our process of iterating on VRS to achieve a good balance of quality vs. performance.

Availability of VRS is based on hardware. To support our goal of reducing the hardware barrier to entry, we wanted to include the broadest range of hardware in our VRS implementation. For this reason, our implementation is exclusively Tier 1, which lets us set a shading rate per draw. For more background on VRS, refer to this blog post.

Because our usage of VRS did not extend past Tier 1, we needed to determine which rendering passes (Gears Tactics uses Unreal Engine 4) could support VRS, as well as create new ways to determine when to reduce the shading rate. We found we could benefit from VRS throughout the renderer, including some full screen draws.

Evaluating the Rendering Pipeline

In our initial investigation, the base pass and translucency stood out as having large potential for gains compared to most other passes. We enabled VRS for all draw calls in the base pass and the translucency pass and immediately noticed that the artifacts were too severe in the translucency pass. Passes that relied on accurate pixel information were one cause of these artifacts. We determined that our next step would be to evaluate which passes had the most promise for gains from VRS, then test multiple shading rates on those passes and exclude any that showed artifacts.

To figure out which passes had the most promise, we enabled VRS on all passes and then used PIX to narrow down the areas that showed a benefit. These areas were:

- Composition after lighting: Lighting for subsurface scattering materials.
- Filter Translucent Volumes: Smooths out translucent meshes within a volume to prevent aliasing issues.
- Light attenuation: Attenuation is the outer bounds when calculating the falloff of a given light. This pass iterates on shadowed lights with this falloff in mind to render shadow projections.
- Light composition tasks (PreLighting): Responsible for screen space ambient occlusion and decals.
- Light Shaft Bloom: Bloom effect generated from light shaft rendering.
- Screen Space Reflections: The process to re-use screen-space information for creating reflections.
- SSR Temporal AA: Anti-aliasing for the results of the Screen Space Reflections pass.
- Direct Deferred Lighting: Responsible for rendering any direct lights to the scene color buffer.

Next, we examined the severity of the artifacts VRS produced in those passes. This process was fairly time consuming, as Gears Tactics has multiple biomes with multiple types of weather conditions. We could have good results in a city during the day, but artifacts in a sand environment with various foliage or at night in the rain.

The following passes showed a performance improvement from VRS, but had too many artifacts to be worth using:

- Translucency
- Deferred Lighting
- Composition after lighting

(Example showing before and after of Composition after lighting at a coarse pixel size of 4x4)

After excluding the passes where VRS was unusable due to artifacts, we began to work on adjusting the shading rate based on other factors. As a first adjustment, we excluded dynamic and opacity masked objects. Dynamic objects cast a fully dynamic shadow. These key objects would often lose important details when VRS was applied, and they were typically more noticeable throughout the scene. Artifacts on dynamic objects also became more obvious once motion was applied. VRS applied to an opacity mask would often result in a great loss of detail. Objects with a pixel depth offset did not render correctly, so those were excluded as well.

(VRS being applied to a collection of masked objects causing an unacceptable amount of loss of detail)

Dynamic Techniques

We now had a good set of rendering passes that could have VRS applied. However, broadly applying the lowest shading rate (4x4 or 2x2 depending on hardware support) across these passes would lead to a very noticeable quality loss. This led us to investigate dynamic techniques that would change the shading rate depending on how the particular pass was being drawn. Our investigation led to three distinct shading techniques: object sizing, depth of field masking, and “Fog of War” masking. We applied each technique immediately before each mesh was drawn in key passes.

Object Sizing

The objective for this dynamic factor was to apply a high amount of VRS to any mesh that was extremely small in world space. Object sizing is a quick condition that checks whether the mesh size is below a small threshold. If it is, the shading rate is scaled proportionally to the size of the mesh.
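
As a rough illustration of the object sizing check, the sketch below maps a mesh's size to a shading rate. It is a minimal sketch under stated assumptions, not the game's implementation: using a bounding radius as the size measure, the threshold value, the linear mapping, and all names (`ShadingRate`, `SMALL_MESH_RADIUS`, `rate_for_object_size`) are hypothetical. Only the overall idea (compare against a small threshold, then scale the rate with the size) comes from the description above.

```python
from enum import IntEnum


class ShadingRate(IntEnum):
    """Coarse pixel sizes ordered from finest to coarsest (illustrative subset)."""
    RATE_1X1 = 0
    RATE_2X2 = 1
    RATE_4X4 = 2


# Hypothetical threshold: meshes with a bounding radius below this count as "small".
SMALL_MESH_RADIUS = 50.0


def rate_for_object_size(radius: float, max_rate: ShadingRate) -> ShadingRate:
    """Return a coarser rate the smaller the mesh is, never coarser than max_rate."""
    if radius >= SMALL_MESH_RADIUS:
        return ShadingRate.RATE_1X1  # large enough: keep full-rate shading
    # 0.0 at the threshold, reaching 1.0 as the mesh shrinks to zero size.
    smallness = 1.0 - max(radius, 0.0) / SMALL_MESH_RADIUS
    # Spread smallness evenly over the allowed rates (1x1 up to max_rate).
    step = min(int(smallness * (int(max_rate) + 1)), int(max_rate))
    return ShadingRate(step)
```

Passing the coarsest allowed rate as a parameter keeps the check independent of hardware support: a caller on hardware limited to 2x2 would pass `ShadingRate.RATE_2X2`, while one with 4x4 support could pass `ShadingRate.RATE_4X4`.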

Depth of Field Masking

For most GPUs, depth of field was not enabled by default during gameplay in Gears Tactics, but it was enabled during cutscenes. This technique determined the amount of blur a mesh would have once depth of field was applied during post processing. Because detail is intentionally lost when the mesh is blurred, we were safe in applying a low shading rate on all the key passes before depth of field.

(Meshes highlighted in red have the coarsest pixel size due to the depth of field)

“Fog of War” Masking

Gears Tactics used a “Fog of War” system to obscure portions of the battlefield. We took advantage of the “Fog of War” to mask a reduced shading rate. This technique determined how deeply a given mesh was shrouded in the “Fog of War” and proportionally lowered the shading rate.

(As the “Fog of War” intensity grows, so does the intensity of VRS)

Quality vs. Performance

With our dynamic techniques in place on the set of rendering passes that cleanly supported VRS, we then looked at tuning the dynamic ranges and the shading rates that would be used. We found that we could get performance gains of upwards of 30% by aggressively lowering the shading rate, but the quality loss was more noticeable than we wanted. The next step was to hand tune the shading rate for each pass across multiple biomes, evaluating performance gain and quality with each change.

While doing this, we realized we could break up the VRS setting into two distinct tiers. The “On” setting would have the least noticeable impact on quality by leaving VRS off for some rendering passes and limiting the shading rate reduction, while the “Performance” setting would make some minor visual quality trade-offs for a larger performance gain by using lower shading rates. This had the added benefit of cleanly supporting shading rates that required extra capabilities.
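
One way to picture the two settings is as a cap on how coarse any masking technique is allowed to go. The sketch below is a minimal illustration, assuming a fog intensity normalized to [0, 1] and a linear mapping to the shading rate. The names (`TIER_CAPS`, `cap_for_setting`, `rate_for_fog`) and the fallback behavior are hypothetical and not taken from the game's code. It caps “On” at 2x2 and “Performance” at 4x4, matching the maximum pixel sizes listed in the results tables, and falls back to 2x2 when the hardware does not report support for coarse pixel sizes above 2.

```python
from enum import IntEnum


class ShadingRate(IntEnum):
    """Coarse pixel sizes ordered from finest to coarsest (illustrative subset)."""
    RATE_1X1 = 0
    RATE_2X2 = 1
    RATE_4X4 = 2


# Coarsest rate each setting may use (assumed caps for this sketch).
TIER_CAPS = {
    "Off": ShadingRate.RATE_1X1,
    "On": ShadingRate.RATE_2X2,
    "Performance": ShadingRate.RATE_4X4,
}


def cap_for_setting(setting: str, additional_rates_supported: bool) -> ShadingRate:
    """Pick the coarsest allowed rate, clamped to what the hardware reports."""
    cap = TIER_CAPS[setting]
    if cap > ShadingRate.RATE_2X2 and not additional_rates_supported:
        return ShadingRate.RATE_2X2  # pixel sizes above 2 are unavailable
    return cap


def rate_for_fog(fog_intensity: float, cap: ShadingRate) -> ShadingRate:
    """Lower the shading rate in proportion to fog intensity, up to the cap."""
    clamped = min(max(fog_intensity, 0.0), 1.0)
    # Round half up to the nearest allowed step between 1x1 and the cap.
    return ShadingRate(int(clamped * int(cap) + 0.5))
```

The same mapping could plausibly serve depth of field masking by feeding in a normalized blur amount instead of a fog intensity.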
Pixel sizes above 2 require additional shading rate support from the hardware, so any pass that could use 2x4, 4x2, or 4x4 shading rates is limited to the “Performance” tier.

Performance Results

We focused our performance testing on Intel Gen 11 and Intel Xe hardware, with the support of Intel, as well as NVIDIA Turing hardware. There were similar performance gains on all the supported hardware.

Testing Hardware

- Operating System: Windows 10 Pro 64-bit (10.0, Build 18362) (18362.19h1_release.190318-1202)
- Processor: Intel(R) Core(TM) i9-9900X CPU @ 3.50GHz (20 CPUs), ~3.5GHz
- Memory: 98304MB RAM
- Card name: NVIDIA GeForce RTX 2080 SUPER
- All tests run at Ultra settings with 4K resolution

| VRS Setting | Frametime (ms) | GPU Savings (ms) | GPU Savings (%) |
| --- | --- | --- | --- |
| VRS Off | 23.3 | - | - |
| VRS On | 20.9 | 2.4 | 10.3 |
| VRS Performance | 18.9 | 4.4 | 18.9 |

| Rendering Pass | VRS “On” | GPU Savings “On” (ms) | VRS “Performance” | GPU Savings “Performance” (ms) |
| --- | --- | --- | --- | --- |
| Base Pass | E1X1 | - | E2X1 | 0.6 |
| SSR Temporal AA | E2X1 | 0.3 | E2X2 | 0.5 |
| Filter Translucent Volumes | E2X2 | <0.1 | E4X4 | 0.1 |
| Light Attenuation | E2X1 | 0.4 | E2X2 | 0.5 |
| Light Composition Tasks (PreLighting) | E2X1 | 1.5 | E2X1 | 1.5 |
| Light Shaft Bloom | E2X2 | <0.1 | E4X4 | 0.1 |
| Screen Space Reflections | E1X1 | - | E1X2 | 0.4 |

| Shading Rate Technique | Max Pixel Size “On” | GPU Savings “On” (ms) | Max Pixel Size “Performance” | GPU Savings “Performance” (ms) |
| --- | --- | --- | --- | --- |
| Object Sizing | E2X2 | 0.1 | E4X4 | 0.1 |
| Depth of Field Masking (Only used in Cutscenes) | E2X2 | 0.2 | E4X4 | 0.3 |
| Fog of War Masking | E2X2 | 0.3 | E4X4 | 0.5 |
| All Advanced Techniques | E2X2 | 0.4 (Gameplay), 0.6 (Cutscenes) | E4X4 | 0.7 (Gameplay), 1.0 (Cutscenes) |
| All Techniques and Passes | E2X2 | 2.4 (Gameplay), 2.6 (Cutscenes) | E4X4 | 4.4 (Gameplay), 4.7 (Cutscenes) |

(All techniques enabled with “VRS On”)

(All techniques enabled with “VRS Performance”)

| VRS Setting | E1x1 Draw Calls | E1x2 Draw Calls | E2x1 Draw Calls | E2x2 Draw Calls | E2x4 Draw Calls | E4x2 Draw Calls | E4x4 Draw Calls |
| --- | --- | --- | --- | --- | --- | --- | --- |
| VRS Off | 4100 | - | - | - | - | - | - |
| VRS On | 3430 | 260 | 150 | 260 | - | - | - |
| VRS Performance | 2791 | 30 | 1047 | 11 | 27 | 8 | 258 |

Expanding on VRS

It is possible to obtain further benefits by expanding on the techniques for applying VRS discussed above.

Further Masking Techniques

More masking techniques could extract additional benefit from VRS or further reduce artifacts. These methods depend on situations where areas are obscured or blurred during gameplay:

- Motion Blur: Scenes with fast camera movement can easily make use of VRS on most meshes while motion blur is enabled.
- Particles: A thick shroud of particles could be used to mask high-intensity VRS on the meshes hidden behind it.

Dynamic Variable Rate Shading

Similar to Dynamic Resolution Scaling, a possible improvement is to scale the amount of VRS depending on the frame rate, to minimize the amount of time the feature needs to be in use. This would be a separate tracking system from Dynamic Resolution Scaling and might need to change at different frequencies to avoid amplifying artifacts.

Dynamic Resolution Scaling

Gears Tactics offers both VRS and Dynamic Resolution Scaling for optimization, but these features do not play well with each other. When Dynamic Resolution Scaling changes the resolution scale, the VRS artifacts change appearance. This can make the player aware of both the VRS artifacts and the resolution scale changes when each would otherwise have gone unnoticed. One way to allow both at the same time could be to monitor the rate of change of Dynamic Resolution Scaling and disable VRS until it has stabilized.

In Conclusion

We found that skillful use of Tier 1 VRS enabled us to achieve significant performance gains across a wide range of hardware with a minimal impact on visual quality.