Mobile devices are typically bandwidth-bound, which means we need to take as few texture samples as possible.
They typically use TBDR GPUs, which means that all rendering takes place on specially optimized tiles. As a side effect, reading back memory from tile to VRAM is really slow, especially on Mali devices.
This commit uses a technique where you do a small blur while downsampling, and then another small blur while upsampling, to get really high-quality glow. While this doesn't reduce the render pass count very much, it does reduce the texture read bandwidth by almost 10 times. Overall, glow was more texture-read bound than memory-write bound, so this was a huge win.
A side effect of this new technique is that we can gather the glow as we upsample instead of gathering the glow in the final tonemap pass. Doing so allows us to significantly reduce the cost of the tonemap pass as well.
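The downsample/upsample idea above can be sketched in one dimension as follows. This is only an illustration, not the shader code from this commit: the `[1, 3, 3, 1]` kernel, the linear upsample filter, the equal per-level weight, and all function names are assumptions chosen for brevity.

```python
import math

def downsample(src):
    """Halve the resolution, applying a small blur per output texel."""
    n = len(src)
    out = []
    for i in range(n // 2):
        c = 2 * i
        # Hypothetical 4-tap kernel centered between texels c and c + 1,
        # clamped at the edges.
        taps = [src[max(c - 1, 0)], src[c], src[c + 1], src[min(c + 2, n - 1)]]
        out.append((taps[0] + 3 * taps[1] + 3 * taps[2] + taps[3]) / 8.0)
    return out

def upsample(src, n):
    """Resample to n texels with linear interpolation (the second small blur)."""
    last = len(src) - 1
    out = []
    for i in range(n):
        pos = (i + 0.5) / 2 - 0.5  # texel center mapped into source space
        lo = math.floor(pos)
        f = pos - lo
        a = src[min(max(lo, 0), last)]
        b = src[min(max(lo + 1, 0), last)]
        out.append(a * (1 - f) + b * f)
    return out

def glow(src, levels, level_weight=1.0):
    """Build the mip chain, then gather glow while walking back up."""
    mips = [src]
    for _ in range(levels):
        mips.append(downsample(mips[-1]))
    acc = mips[-1]
    for level in range(levels - 1, 0, -1):
        up = upsample(acc, len(mips[level]))
        # Gathering happens here, during the upsample, rather than in a
        # later tonemap pass.
        acc = [u + level_weight * m for u, m in zip(up, mips[level])]
    return upsample(acc, len(src))
```

Each pass reads only a handful of texels, and the wide blur radius comes from repeating the passes across resolutions rather than from a large kernel at full resolution.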
This commit changes adjustments to behave as follows for all rendering configurations:
- Apply brightness to linear-encoded values, preventing contrast, saturation, and hue from being affected.
- Apply contrast to perceptually uniform (nonlinear sRGB-encoded) values, matching existing behavior when HDR 2D is disabled and producing optimal visual quality.
- Apply saturation with even color channel weights. This causes brightness of certain colors to change, but matches existing behavior when HDR 2D is disabled.
Adjustments are applied after glow and tonemapping to match existing behavior.
Additionally, change the minimum `tonemap_white` parameter to `1.0`; users can increase `tonemap_exposure` for a similar effect to decreasing `tonemap_white` below `1.0`.
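As a rough sketch of the ordering described above: brightness scales linear values, contrast operates on sRGB-encoded values, and saturation uses an even (1/3, 1/3, 1/3) gray weight. The `0.5` contrast pivot, the choice to apply saturation on the encoded values, and the function names are assumptions for illustration and may not match the actual shader.

```python
def srgb_encode(x):
    """Linear -> nonlinear sRGB (standard piecewise transfer function)."""
    x = max(x, 0.0)
    return 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1.0 / 2.4) - 0.055

def apply_adjustments(linear_rgb, brightness=1.0, contrast=1.0, saturation=1.0):
    # 1. Brightness on linear-encoded values: a pure scale.
    rgb = [c * brightness for c in linear_rgb]
    # 2. Contrast on sRGB-encoded values (pivot of 0.5 is an assumption).
    enc = [srgb_encode(c) for c in rgb]
    enc = [(v - 0.5) * contrast + 0.5 for v in enc]
    # 3. Saturation with even channel weights: gray is the plain average,
    #    not a luminance-weighted sum.
    gray = sum(enc) / 3.0
    return [gray + (v - gray) * saturation for v in enc]
```

With the even weights in step 3, a pure blue and a pure green of the same encoded value desaturate to the same gray, which is why the perceived brightness of some colors shifts.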
Co-authored-by: Hei <40064911+Lielay9@users.noreply.github.com>
Co-authored-by: Hugo Locurcio <hugo.locurcio@hugo.pro>
This is necessary to ensure that SpvCapabilityMultiView is included in
the SPIR-V, informing downstream transpilers, such as the one for Metal,
that they should enable multi-view capabilities in the generated Metal
shader source.
This change improves performance of the AgX tonemapper by allowing two matrix multiplications to be combined into one. This comes at the cost of losing color information that could be correctly interpreted as positive RGB values in the Rec. 2020 color space. Additionally, an insignificant amount of error is intentionally introduced to the input color value to avoid the need for a second max function call before log2. The final negative color clipping has been removed to allow the tonemapper to return negative RGB values, similar to other tonemappers in Godot.
This changes the polynomial function so that a lower input always results in a lower output and vice versa. Additionally, the new function returns a value that is much closer to 1.0 when given an input of 1.0.
Technical implementation notes:
- Moved linearization step to before the outset matrix is applied and
changed polynomial contrast curve approximation.
- This does *not* implement Blender's chroma rotation to address hue shift.
This hue rotation was found to have a significant performance impact.
- Improved performance by combining the AgX outset matrix with the Rec 2020 matrix.
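The matrix combination relies on associativity: applying two 3x3 matrices in sequence equals applying their product once, so the product can be precomputed. A minimal sketch follows; the matrix values in the usage note are placeholders, not the real AgX outset or Rec. 2020 matrices, and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    """3x3 matrix times a 3-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def two_step(first, second, color):
    """Per-pixel cost: two matrix-vector multiplications."""
    return mat_vec(second, mat_vec(first, color))

def one_step(combined, color):
    """Per-pixel cost: one matrix-vector multiplication."""
    return mat_vec(combined, color)
```

For example, with placeholder matrices `first` and `second`, `combined = mat_mul(second, first)` is computed once ahead of time, and `one_step(combined, color)` then matches `two_step(first, second, color)` up to floating-point rounding. This only works when no nonlinear step sits between the two matrices, which is consistent with the linearization being moved to before the outset matrix.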
Co-authored-by: Allen Pestaluky <allenpestaluky@gmail.com>
Co-authored-by: Clay John <claynjohn@gmail.com>
- Buffers changing their usage are no longer treated as write usage unless the API requires it.
- Draw lists are not treated as being dependent on each other if their regions do not intersect despite both being write commands.
- Particles were tweaked to use different unused buffers to reduce dependencies.
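The draw list rule above comes down to a rectangle overlap test. The sketch below is an assumption about its general shape, not the engine's code; `Rect` and `draw_lists_depend` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

def intersects(a, b):
    """True if the two rectangles share at least one pixel."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)

def draw_lists_depend(region_a, region_b):
    """Two write commands only need ordering if their regions overlap."""
    return intersects(region_a, region_b)
```

Rectangles that merely touch along an edge share no pixels, so under this test they are treated as independent.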
- Use negative clip space values to fix reversed rotations in reflections.
- Use constant -z view vector when raymarching to fix perspective in reflections.
Using 2.2.7.dev217+g10c2abcf.
Had to add `colour` to the ignore list as we use it as an alias/keyword for the
documentation of color-related APIs.
Also ignore recommendations to change `thirdparty` to either `third-party` or
`third party`, which are correct, but we use `thirdparty` fairly consistently.
Adds a new system to automatically reorder commands, perform layout transitions and insert synchronization barriers based on the commands issued to RenderingDevice.
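One way to picture such a system is a dependency graph built from the resources each command reads and writes, with commands grouped into levels that can run without a barrier between them. The sketch below is a simplified illustration under that assumption. `Command`, `build_levels`, and the resource names are hypothetical, and the real implementation also tracks layouts and usage details that are omitted here.

```python
from dataclasses import dataclass

@dataclass
class Command:
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()

def build_levels(commands):
    """Group commands into levels; a barrier would go between levels."""
    last_writer = {}          # resource -> index of the last writing command
    readers_since_write = {}  # resource -> indices reading since that write
    level = []
    for i, cmd in enumerate(commands):
        deps = set()
        for r in cmd.reads:   # read-after-write
            if r in last_writer:
                deps.add(last_writer[r])
        for r in cmd.writes:  # write-after-write and write-after-read
            if r in last_writer:
                deps.add(last_writer[r])
            deps.update(readers_since_write.get(r, []))
        # A command runs one level after its deepest dependency.
        level.append(1 + max((level[d] for d in deps), default=-1))
        for r in cmd.reads:
            readers_since_write.setdefault(r, []).append(i)
        for r in cmd.writes:
            last_writer[r] = i
            readers_since_write[r] = []
    groups = {}
    for i, lvl in enumerate(level):
        groups.setdefault(lvl, []).append(commands[i].name)
    return [groups[lvl] for lvl in sorted(groups)]
```

In this model, a copy that was issued late but touches no resource used by earlier commands ends up in the first level, which is the kind of reordering the description refers to.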