Mobile devices are typically bandwidth bound which means we need to do as few texture samples as possible.
They typically use TBDR GPUs which means that all rendering takes place on special optimized tiles. As a side effect, reading back memory from tile to VRAM is really slow, especially on Mali devices.
This commit uses a technique where you do a small blur while downsampling, and then another small blur while upsampling to get really high quality glow. While this doesn't reduce the renderpass count very much, it does reduce the texture read bandwidth by almost 10 times. Overall glow was more texture-read bound than memory write, bound, so this was a huge win.
A side effect of this new technique is that we can gather the glow as we upsample instead of gathering the glow in the final tonemap pass. Doing so allows us to significantly reduce the cost of the tonemap pass as well.
This work is a heavily refactored and rewritten from TheForge's initial
code.
TheForge's original code had too many race conditions and was
fundamentally flawed as it was too easy to incur into those data races
by accident.
However they identified the proper places that needed changes, and the
idea was sound. I used their work as a blueprint to design this work.
This PR implements:
- Introduction of UMA buffers used by a few buffers
(most notably the ones filled by _fill_instance_data).
Ironically this change seems to positively affect PC more than it does
on Mobile.
Updates D3D12 Memory Allocator to get GPU_UPLOAD heap support.
Metal implementation by Stuart Carnie.
Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: TheForge team
This commit changes adjustments to behave as follows for all rendering configurations:
- Apply brightness to linear-encoded values, preventing contrast, saturation, and hue from being affected.
- Apply contrast to perceptually uniform (nonlinear sRGB-encoded) values, matching existing behavior when HDR 2D is disabled and producing optimal visual quality.
- Apply saturation with even color channel weights. This causes brightness of certain colors to change, but matches existing behavior when HDR 2D is disabled.
Adjustments are applied after glow and tonemapping to match existing behavior.
Introduce a specialised `texture_get_data` for `RenderDeviceDriver`,
which can retrieve the texture data from the GPU driver for shared
textures (`TEXTURE_USAGE_CPU_READ_BIT`).
Closes#108115
Additionally, change the minimum `tonemap_white` parameter to `1.0`; users can increase `tonemap_exposure` for a similar effect to decreasing `tonemap_white` below `1.0`.
Co-authored-by: Hei <40064911+Lielay9@users.noreply.github.com>
Co-authored-by: Hugo Locurcio <hugo.locurcio@hugo.pro>