Sometimes you think you have everything working, then you go to call part of your program and it turns out that no, it’s not working.
That is the predicament I find myself in now. While I have been successful in getting UE5 to load my pmo_library.dll, it turns out there’s a strange, unknown bug in flecs when it’s linked into a shared library. At first I thought it was a problem with the Unreal build system and how everything was linking, but when I ran my tests on Windows, only the flecs test failed.
This led me down a rabbit hole that has consumed the past 4-5 days. It appears if you link flecs into a shared library, random bugs occur when attempting to import my flecs modules. When I statically link flecs into my shared library I get one crash. When I dynamically link it I get another crash.
If I statically link my library and statically link flecs, then I get no crashes. But I can’t do this because I can’t statically link my library for UE5, it requires it be a shared (DLL) library.
I’ve opened an issue in the flecs Discord, and I hope someone can help me out. While I wait, I continue to debug.
Let’s take a look at my process so far. First I updated to the latest version (I was 2 minor releases behind). Then I wanted to trace the differences when you statically link everything, vs create shared libraries.
Here were my results.
Binary with statically linked flecs (flecs_static) and my statically linked library:
- Call ecs.import<movement::module>() from the main binary
- Calls impl.hpp:13 (do_import)
- This calls flecs_cpp.c:348 ecs_cpp_component_register_explicit once with my module component movement::module (type.size = 1, alignment = 1), so it passes assertions
- impl.hpp:13 (do_import) calls world.emplace<T>(world)
- Emplace calls:
  inline void emplace(world_t *world, flecs::entity_t entity, flecs::id_t id, Args&&... args) {
      ecs_assert(_::cpp_type<T>::size() != 0, ECS_INVALID_PARAMETER, NULL);
      T& dst = *static_cast<T*>(ecs_emplace_id(world, entity, id));
      FLECS_PLACEMENT_NEW(&dst, T{FLECS_FWD(args)...}); // <-- some c++ templating magic
- “some c++ templating magic” calls my movement::module constructor
- Program works as expected
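For readers unfamiliar with what the “templating magic” line is doing, here is a minimal sketch of the same idea: placement new constructs an object inside storage that was allocated elsewhere, forwarding whatever constructor arguments were passed in. This is a simplified stand-in, not the flecs implementation; demo_module, emplace_into and emplace_demo are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical stand-in for a module type; NOT the real movement::module.
struct demo_module {
    int constructed_with;
    explicit demo_module(int world_id) : constructed_with(world_id) {}
};

// Simplified stand-in for the emplace step (an assumption, not flecs code):
// construct a T in storage someone else allocated, forwarding the arguments.
template <typename T, typename... Args>
T& emplace_into(void* storage, Args&&... args) {
    // Placement new runs T's constructor at the given address without allocating.
    return *::new (storage) T{std::forward<Args>(args)...};
}

// Builds a demo_module in a local buffer and returns what its constructor saw.
inline int emplace_demo() {
    alignas(demo_module) unsigned char buffer[sizeof(demo_module)];
    demo_module& m = emplace_into<demo_module>(buffer, 42);
    // The object lives inside the caller-provided buffer.
    assert(static_cast<void*>(&m) == static_cast<void*>(buffer));
    int seen = m.constructed_with;
    m.~demo_module();  // objects created with placement new are destroyed explicitly
    return seen;
}
```

In the static trace above, the equivalent of this step lands directly in the module constructor, which is the behaviour one would expect from a plain placement new.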
Binary with dynamically linked flecs (flecs) and my dynamically linked library:
- Call ecs.import<movement::module>() from the main binary
- Calls impl.hpp:13 (do_import)
- This calls ecs_cpp_component_register_explicit once with my module component movement::module (type.size = 1, alignment = 1), so it passes assertions
- impl.hpp:13 (do_import) calls world.emplace<T>(world);
- emplace calls flecs_cpp.c:400 (ecs_component_init):
  inline void emplace(world_t *world, flecs::entity_t entity, flecs::id_t id, Args&&... args) {
      ecs_assert(_::cpp_type<T>::size() != 0, ECS_INVALID_PARAMETER, NULL);
      T& dst = *static_cast<T*>(ecs_emplace_id(world, entity, id));
      FLECS_PLACEMENT_NEW(&dst, T{FLECS_FWD(args)...}); // <-- some c++ templating magic
- “some c++ templating magic” jumps into a number of unknown functions inside my library.dll until finally going back into ecs_cpp_component_register_explicit:
flecs.dll!ecs_cpp_component_register_explicit(ecs_world_t * world, unsigned __int64 s_id, unsigned __int64 id, const char * name, const char * type_name, const char * symbol, unsigned __int64 size, unsigned __int64 alignment, bool is_component, bool * existing_out) Line 400 (c:\Users\isaac\Documents\Unreal Projects\pmoclient\Plugins\PMO\Source\ThirdParty\pmo\build\_deps\flecs-src\src\addons\flecs_cpp.c:400)
pmo_library.dll!00007ffec7084415() (Unknown Source:0)
pmo_library.dll!00007ffec70829d7() (Unknown Source:0)
pmo_library.dll!00007ffec707f016() (Unknown Source:0)
pmo_library.dll!00007ffec707de6c() (Unknown Source:0)
- Now my module name is prefixed with ::, such as ::movement::module, in ecs_cpp_component_register_explicit as it tries to register it, or check whether it’s registered, again
- entity.c:1888 (ecs_component_init), which calls flecs_check_component, attempts to check that the const_ptr of the component matches the size/alignment, but fails because size is 0
- ecs_abort(ECS_INVALID_COMPONENT_SIZE, path); is called because ptr->size (1) != size (0)
- Crash:
  fatal: entity.c: 1878: movement.module (INVALID_COMPONENT_SIZE)
Clearly there is some oddness going on with the templating magic in step 6, which behaves differently between the statically and dynamically linked versions. I just cannot for the life of me figure out what. It does appear to be doing some sort of secondary lookup when it’s crash-y. This makes me think there is something wrong with the component registration between the DLL and the binary, but it’s not clear what.
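To make the failing check concrete, here is a toy sketch of its general shape: a registry remembers the size a component name was first registered with and rejects a later registration that reports a different size. This is not flecs code, and it does not explain why the second registration sees a size of 0; toy_registry, empty_module and second_registration_accepted are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical module type with no data members (an assumption about the
// real module, which reported type.size = 1 in the traces above).
struct empty_module {};

// C++ guarantees that even an empty struct has a nonzero size.
static_assert(sizeof(empty_module) >= 1, "empty structs still occupy storage");

// Toy registry, NOT flecs: remembers the size each name was first registered with.
class toy_registry {
public:
    // Returns true if the name is new, or if the size matches the earlier one.
    bool register_component(const std::string& name, std::size_t size) {
        auto result = sizes_.emplace(name, size);
        return result.second || result.first->second == size;
    }

private:
    std::map<std::string, std::size_t> sizes_;
};

// Simulates a correct first registration followed by one reporting size 0.
inline bool second_registration_accepted() {
    toy_registry reg;
    bool first = reg.register_component("movement.module", sizeof(empty_module));
    assert(first);  // the first registration always succeeds
    return reg.register_component("movement.module", 0);
}
```

Here second_registration_accepted() returns false, mirroring the ptr->size (1) != size (0) mismatch in the trace. The open question is still where the 0 comes from on the DLL path.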
To remove as many variables as possible, I created a minimal reproducible project, which I highly recommend you do any time you are investigating oddities such as this. It also helps when filing bug reports for the maintainers! Speaking of issues, I did come across this issue, which seems somewhat similar in that behavior differs when running as a DLL.
So if you’ve been wondering why I haven’t posted any updates lately, it’s because I’m quite stuck! I really hope I don’t have to get rid of flecs, so I’ll give this another few weeks of keyboard smashing to see if I, or the maintainers, can resolve the problem.