A lot of what Clang has historically done for atomics is just a legacy of the IR lacking full support for them. Much of that has been addressed over time. So at this point, I guess I’m unconvinced that there’s enough complexity which both has to stay in a frontend and can actually be handled in a frontend-independent manner to make abstracting it worthwhile.
The decision of when Clang emits a direct libcall is now simple: if the size is not a power of 2, or is more than 16 bytes, it emits a libcall directly. The power-of-2 requirement is just an IR validity limit, which we could potentially relax if we wished.
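To make that rule concrete, here is a minimal sketch of it as a standalone predicate. This is not Clang's actual code: the names `is_power_of_two`, `emits_direct_libcall`, and `kMaxInlineAtomicBytes` are made up for illustration, and treating size 0 as "not a power of 2" is my own assumption.

```cpp
#include <cstddef>

// Hypothetical constant: the 16-byte cutoff described above.
constexpr std::size_t kMaxInlineAtomicBytes = 16;

// True for 1, 2, 4, 8, ...; false for 0 (an assumption of this sketch).
constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical predicate mirroring the rule: libcall if the size is not a
// power of 2, or if it is larger than 16 bytes.
constexpr bool emits_direct_libcall(std::size_t size_bytes) {
    return !is_power_of_two(size_bytes) || size_bytes > kMaxInlineAtomicBytes;
}

static_assert(!emits_direct_libcall(4), "4 bytes: direct IR");
static_assert(!emits_direct_libcall(16), "16 bytes: direct IR");
static_assert(emits_direct_libcall(3), "3 bytes: not a power of 2");
static_assert(emits_direct_libcall(32), "32 bytes: larger than 16");
```

Relaxing the IR validity limit would amount to dropping the `!is_power_of_two` half of that condition on the frontend side.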
In other cases, we emit direct IR, just potentially with a bitcast to an integer type. Bitcasting to an integer in the frontend doesn’t seem onerous – but we could also potentially remove that IR requirement.
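As a rough source-level analogy for that bitcast (not what Clang literally emits), the value can live in an integer-typed atomic slot and be reinterpreted on the way in and out. The helper names `store_float_via_int` and `load_float_via_int` are hypothetical, and the sketch assumes a 32-bit `float`.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: reinterpret the float's bits as an integer,
// then do the atomic store on the integer.
inline void store_float_via_int(std::atomic<std::uint32_t>& slot, float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    slot.store(bits, std::memory_order_seq_cst);
}

// Hypothetical helper: atomic integer load, then reinterpret back to float.
inline float load_float_via_int(const std::atomic<std::uint32_t>& slot) {
    std::uint32_t bits = slot.load(std::memory_order_seq_cst);
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

The atomic operation itself only ever sees an integer of the same width, which is why this step is cheap for a frontend to do.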
Some of the complexity in Clang is there to pad out, e.g., a 3-byte struct to 4 bytes, so that it can use lock-free atomics. I think this needs to be frontend-specific – I don’t know how it could reasonably be abstracted in a frontend-agnostic way.
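Roughly, the padding looks like this if you do it by hand. This is only an illustration: `Rgb`, `PaddedRgb`, and `to_padded` are made-up names, the sizes assume a mainstream ABI with 1-byte `unsigned char`, and whether the 4-byte atomic is actually lock-free depends on the target.

```cpp
#include <atomic>

// A 3-byte payload: not a power-of-2 size.
struct Rgb {
    unsigned char r, g, b;
};

// Hypothetical padded wrapper: one explicit padding byte brings it to 4 bytes.
struct PaddedRgb {
    Rgb value;
    unsigned char pad[1];
};

static_assert(sizeof(Rgb) == 3, "assumes a 3-byte layout");
static_assert(sizeof(PaddedRgb) == 4, "padded up to a power-of-2 size");

// Zero the padding so that bytewise comparisons (e.g. compare-exchange)
// never see stray bits.
inline PaddedRgb to_padded(Rgb v) {
    return PaddedRgb{v, {0}};
}
```

The layout change is visible to everything else that touches the type (`sizeof`, ABI, struct embedding), which is part of why I think it has to be decided by the frontend.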
The question you raise about clang’s libcall generation vs llvm’s libcall generation does seem like an interesting problem, but to me, that seems like a generic issue – not specific to atomic libcalls – which therefore needs a generic answer.