This work is a heavily refactored and rewritten from TheForge's initial
code.
TheForge's original code had too many race conditions and was
fundamentally flawed as it was too easy to incur into those data races
by accident.
However they identified the proper places that needed changes, and the
idea was sound. I used their work as a blueprint to design this work.
This PR implements:
- Introduction of UMA buffers used by a few buffers
(most notably the ones filled by _fill_instance_data).
Ironically this change seems to positively affect PC more than it does
on Mobile.
Updates D3D12 Memory Allocator to get GPU_UPLOAD heap support.
Metal implementation by Stuart Carnie.
Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: TheForge team
PR #100062 introduced BUFFER_USAGE_DEVICE_ADDRESS_BIT.
However it did so by adding a boolean to uniform_buffer_create(), called
"bool p_enable_device_address".
This makes maintaining backwards compatibility harder because I am
working on another feature that would require introducing yet another
bit flag.
This would save us the need to add fallback routines when the feature I
am working on makes it to Godot 4.5.
Even if my feature doesn't make it to 4.5 either, this PR makes the
routine more future-proof.
This PR also moves STORAGE_BUFFER_USAGE_DEVICE_ADDRESS into
BUFFER_CREATION_DEVICE_ADDRESS_BIT, since it's an option available to
both storage and uniforms.
This PR also moves the boolean use_as_storage into
BUFFER_CREATION_AS_STORAGE.
PR #90993 added several debugging utilities.
Among them, advanced memory tracking through the use of custom
allocators and VK_EXT_device_memory_report.
However as issue #95967 reveals, it is dangerous to leave it on by
default because drivers (or even the Vulkan loader) can too easily
accidentally break custom allocators by allocating memory through std
malloc but then request us to deallocate it (or viceversa).
This PR fixes the following problems:
- Adds --extra-gpu-memory-tracking cmd line argument
- Adds missing enum entries to
RenderingContextDriverVulkan::VkTrackedObjectType
- Adds RenderingDevice::get_driver_and_device_memory_report
- GDScript users can easily check via print(
RenderingServer.get_rendering_device().get_driver_and_device_memory_report()
)
- Uses get_driver_and_device_memory_report on device lost for appending
further info.
Fixes#95967
Features:
- Debug-only tracking of objects by type. See
get_driver_allocs_by_object_type et al.
- Debug-only Breadcrumb info for debugging GPU crashes and device lost
- Performance report per frame from get_perf_report
- Some VMA calls had to be modified in order to insert the necessary
memory callbacks
Functionality marked as "debug-only" is only available in debug or dev
builds.
Misc fixes:
- Early break optimization in RenderingDevice::uniform_set_create
============================
The work was performed by collaboration of TheForge and Google. I am
merely splitting it up into smaller PRs and cleaning it up.
Adds a new system to automatically reorder commands, perform layout transitions and insert synchronization barriers based on the commands issued to RenderingDevice.
Credit and thanks to @bruzvg for multiple build fixes, update of 3rd-party items and MinGW support.
Co-authored-by: bruvzg <7645683+bruvzg@users.noreply.github.com>