
Written by Nicholas Lawson, member of the O3DE Technical Community
Address Sanitizer (ASAN) changes
When fixing crashes and bugs, one powerful tool at our disposal is Address Sanitizer (ASAN), which is available in Clang, GCC, and Microsoft Visual Studio: https://learn.microsoft.com/en-us/cpp/sanitizers/asan?view=msvc-170
It helps find memory issues that typically cause mystery crashes, deadlocks, corruption, and other subtle problems where the source of the issue may be far removed, in both time and code location, from the unexpected behavior. Because memory gets corrupted, you might find mystery bug reports in systems that seem perfectly fine: the actual problem was caused by a completely different system corrupting the memory without being caught in the act.
O3DE supports using the Address Sanitizer via the CMake flag LY_BUILD_WITH_ADDRESS_SANITIZER. Setting this to a non-false value does the following:
- Turn off the use of the High Performance Heap Allocator (HPHA) custom allocator and pipe all SystemAllocator memory allocations directly to operating system malloc (aligned versions), so that tools like ASAN can see the allocations at a more granular level.
- Turn on the use of ASAN, with all of the default features enabled.
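To illustrate how code can react to an ASAN build, here is a minimal sketch of compile-time detection. It relies on the fact that Clang exposes `__has_feature(address_sanitizer)` while GCC and MSVC define `__SANITIZE_ADDRESS__`. The `EXAMPLE_ASAN_ENABLED` macro and both helper functions are hypothetical names made up for this example; they are not O3DE's actual mechanism.

```cpp
// Clang reports ASAN through __has_feature; GCC and MSVC define __SANITIZE_ADDRESS__.
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define EXAMPLE_ASAN_ENABLED 1
#  endif
#endif
#if !defined(EXAMPLE_ASAN_ENABLED) && defined(__SANITIZE_ADDRESS__)
#  define EXAMPLE_ASAN_ENABLED 1
#endif
#if !defined(EXAMPLE_ASAN_ENABLED)
#  define EXAMPLE_ASAN_ENABLED 0
#endif

// Hypothetical helper: true when this translation unit was built with ASAN.
constexpr bool IsAsanBuild()
{
    return EXAMPLE_ASAN_ENABLED != 0;
}

// Hypothetical helper: names the allocation backend such a build might select.
constexpr const char* AllocatorBackendName()
{
    return IsAsanBuild() ? "malloc pass-through (visible to ASAN)"
                         : "custom heap allocator";
}
```

A build that wants ASAN to see every allocation could branch on a check like this to bypass its custom heap.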
It is not recommended that you try to run anything large in O3DE with ASAN enabled, because some of ASAN's features make it retain memory rather than release it, even after the application frees it, in order to detect errors such as use-after-free. Loading the Editor this way requires quite a lot of RAM, but it is not impossible.
However, we have thousands and thousands of unit and integration tests which can run standalone, using the GoogleTest framework. So the best way to use ASAN here is to switch to a debug build, enable ASAN, and then run individual tests, which is what I did.
For this point release, I made the following changes:
- I made sure that the allocators we use, such as the System Allocator, defer to malloc properly when ASAN is active. I also made sure that they still correctly monitor the actual number of bytes allocated, even when they are not tracking the individual allocations themselves.
- I went through every ASAN error I found in every test in every core library and fixed them. (This means everything under Code/, including tools like AssetProcessor, as well as everything in the Atom gems). I haven’t tested the other gems yet, but intend to.
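The first change above can be pictured with a small sketch. This is not the O3DE System Allocator. It is a hypothetical `PassThroughAllocator` that forwards every request to the C++ runtime (using aligned `operator new` as a portable stand-in for aligned malloc) and keeps only a running byte total. It assumes the caller passes the size back on deallocation, which is how the total stays correct without any per-allocation records.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical pass-through allocator: each request goes straight to the
// runtime, so a tool like ASAN sees every allocation individually.
class PassThroughAllocator
{
public:
    void* Allocate(std::size_t byteSize, std::size_t alignment)
    {
        // Aligned operator new requires a power-of-two alignment.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        void* ptr = ::operator new(byteSize, std::align_val_t(alignment));
        m_bytesAllocated.fetch_add(byteSize, std::memory_order_relaxed);
        return ptr;
    }

    // Assumption: the caller supplies the original size and alignment,
    // so no table of live allocations is needed to keep the total right.
    void Deallocate(void* ptr, std::size_t byteSize, std::size_t alignment)
    {
        if (ptr == nullptr)
        {
            return;
        }
        ::operator delete(ptr, std::align_val_t(alignment));
        m_bytesAllocated.fetch_sub(byteSize, std::memory_order_relaxed);
    }

    // Total bytes currently outstanding.
    std::size_t BytesAllocated() const
    {
        return m_bytesAllocated.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::size_t> m_bytesAllocated{0};
};
```

Because nothing is pooled or recycled inside the allocator, a freed block really is freed, which is what lets ASAN flag a later access to it.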
Several problems were found. Most notably, every problem found by ASAN was a real problem; there were no false positives. We didn't have to disable or customize ASAN detections on Windows. See https://github.com/o3de/o3de/pull/18655 for the details, but here are some highlights that I fixed:
- RapidJSON would overrun a document buffer looking for whitespace using SIMD operations.
- Several cases were found where systems were using wide-version string format functions but using char* buffers as parameters.
- Script Events had a subtle use-after-free bug when Script Event Values were duplicated.
- Windows File IO was converting UTF-8 strings to wchar_t, but was not checking the input for UTF-8 encoding errors.
- Many cases where overzealous use of string_view led to a use-after-free. Please, developers, string_view is to be used with care; it is not a drop-in substitute for string.
- A googletest death_test was being used completely unnecessarily and was causing potential deadlocks.
- A pure memory leak from not properly deleting a non-virtual type in the network serializer.
- Use-after-free in GraphCanvas/ScriptCanvas.
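The string_view item above is worth a concrete illustration. The sketch below is not taken from the O3DE fixes; `MakeAssetPath` and `ExtensionLength` are hypothetical functions invented for this example. The dangerous pattern is left as a comment, and the live code shows the safer form, in which a `std::string` owns the bytes for as long as the view is in use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper that builds and returns a brand-new string.
std::string MakeAssetPath(std::string_view folder, std::string_view file)
{
    assert(!file.empty() && "a file name is required");
    std::string path(folder);
    path += '/';
    path += file;
    return path;
}

// DANGEROUS (kept as a comment on purpose):
//   std::string_view view = MakeAssetPath("textures", "icon.png");
//   // The temporary std::string dies at the end of that statement, so
//   // 'view' now points at destroyed storage. Reading it is the kind of
//   // bug ASAN reports.

// Safer: keep an owning std::string alive for as long as the view is used.
std::size_t ExtensionLength(std::string_view folder, std::string_view file)
{
    const std::string owned = MakeAssetPath(folder, file); // owns the bytes
    const std::string_view view = owned;                   // valid while 'owned' lives
    const std::size_t dot = view.rfind('.');
    return dot == std::string_view::npos ? 0 : view.size() - dot - 1;
}
```

The rule of thumb this sketch suggests: a string_view is fine as a function parameter, but storing one, or binding one to a returned temporary, needs a clear answer to "who owns these bytes, and for how long?"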
All in all, I would highly recommend that developers use ASAN in their own projects: write some good tests and put the code through the wringer. When you see ASAN complain about something, be really, really skeptical about claims that it is a false positive, no matter how innocent the code seems.
This does not mean that all possible buffer errors or corruption issues are found, but the core framework tests are fairly comprehensive in terms of coverage, and it’s good to know that the core that everything else is built upon passes its test coverage cleanly.
Performance Changes (Icons)
A developer reported that when a lot of icons are on the screen in the editor, framerate suffers heavily, dropping from a steady 60fps down to 10 or 5.
I discovered that icon rendering issued a separate draw call for each icon, which incurred quite a lot of overhead. I made a simple optimization that collects all icons, buckets them by texture, and draws every icon that shares a texture in one draw call. This is a fairly standard optimization. More could be done to make it even faster (such as using geometry shaders to generate quads, or atlasing textures so that everything is drawn in a single call), but the truth is that after this simpler batching change, icon rendering no longer shows up in the profiler as a meaningful cost and no longer seems to impact framerate.
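The batching idea can be sketched in a few lines. This is an illustration of the general technique, not the actual O3DE change: `Icon`, `DrawBatch`, and `BuildIconBatches` are hypothetical names, and a real renderer would go on to submit one draw call per returned batch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical icon record: which texture it uses and where it is drawn.
struct Icon
{
    std::uint32_t textureId;
    float x;
    float y;
};

// Hypothetical batch: all icons that share one texture.
struct DrawBatch
{
    std::uint32_t textureId;
    std::vector<Icon> icons;
};

// Buckets icons by texture so that each texture needs only one draw call.
std::vector<DrawBatch> BuildIconBatches(const std::vector<Icon>& icons)
{
    std::map<std::uint32_t, std::vector<Icon>> buckets;
    for (const Icon& icon : icons)
    {
        buckets[icon.textureId].push_back(icon);
    }

    std::vector<DrawBatch> batches;
    batches.reserve(buckets.size());
    for (auto& [textureId, bucket] : buckets)
    {
        assert(!bucket.empty()); // every bucket holds at least one icon
        batches.push_back(DrawBatch{textureId, std::move(bucket)});
    }
    return batches;
}
```

With this shape, the number of draw calls scales with the number of distinct textures rather than the number of icons, which is why a scene with hundreds of icons but only a handful of icon types becomes cheap.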