Mobile devices are typically bandwidth bound which means we need to do as few texture samples as possible.
They typically use TBDR GPUs which means that all rendering takes place on special optimized tiles. As a side effect, reading back memory from tile to VRAM is really slow, especially on Mali devices.
This commit uses a technique where you do a small blur while downsampling, and then another small blur while upsampling to get really high quality glow. While this doesn't reduce the renderpass count very much, it does reduce the texture read bandwidth by almost 10 times. Overall glow was more texture-read bound than memory write, bound, so this was a huge win.
A side effect of this new technique is that we can gather the glow as we upsample instead of gathering the glow in the final tonemap pass. Doing so allows us to significantly reduce the cost of the tonemap pass as well.
Add missing <winnt.h> include in windows/lang_table.h
Add missing typedefs.h and rid.h include in godot_constraint_3d.h
Add missing <type_traits> include in iterable.h
Add missing forward declarations in rendering_shader_library.h
This can result in low fog densities being a bit brighter, which may need slight adjustment in your scenes.
In exchange, volumetrics now blend smoothly together with the scene and regular fog.
Fixes#101514
This work is a heavily refactored and rewritten from TheForge's initial
code.
TheForge's original code had too many race conditions and was
fundamentally flawed as it was too easy to incur into those data races
by accident.
However they identified the proper places that needed changes, and the
idea was sound. I used their work as a blueprint to design this work.
This PR implements:
- Introduction of UMA buffers used by a few buffers
(most notably the ones filled by _fill_instance_data).
Ironically this change seems to positively affect PC more than it does
on Mobile.
Updates D3D12 Memory Allocator to get GPU_UPLOAD heap support.
Metal implementation by Stuart Carnie.
Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: TheForge team
This commit changes adjustments to behave as follows for all rendering configurations:
- Apply brightness to linear-encoded values, preventing contrast, saturation, and hue from being affected.
- Apply contrast to perceptually uniform (nonlinear sRGB-encoded) values, matching existing behavior when HDR 2D is disabled and producing optimal visual quality.
- Apply saturation with even color channel weights. This causes brightness of certain colors to change, but matches existing behavior when HDR 2D is disabled.
Adjustments are applied after glow and tonemapping to match existing behavior.
RenderingDevice::screen_get_framebuffer_format should return a value of
type RenderingDevice::FramebufferFormatID which is an alias of int64_t
but it returns Error::FAILED with a value of 1. The compiler does not
complain because both types are integers but 1 corresponds to a valid
FramebufferFormatID, meaning a certain failure condition is missed.
This commit changes it to the correct value, INVALID_ID.
Introduce a specialised `texture_get_data` for `RenderDeviceDriver`,
which can retrieve the texture data from the GPU driver for shared
textures (`TEXTURE_USAGE_CPU_READ_BIT`).
Closes#108115