Document function signatures in C ABI #161
BasicCABI.md
Outdated
### Function signatures

Scalar types passed as arguments are passed via WebAssembly function parameters and returned via WebAssembly function
results of their corresponding Wasm value types. `long double`, which has no corresponding Wasm value type, is passed
General nitpick: I'd format this as a list / somehow split up into lines for easier readability.
@RReverser I made it a table, ptal.
Would it make sense to mention the upcoming multivalue ABI here, so that people building tools that know about C ABIs are aware that this change is coming?
BasicCABI.md
Outdated
Types can be passed directly via WebAssembly function parameters or indirectly
via a pointer parameter that points to the value in memory. Similarly, types can
either be returned directly from WebAssembly functions or returned indirectly
For byval IIRC LLVM currently guarantees that the callee will not modify the buffer. Do we want to make a similar guarantee in the ABI? For sret, it's probably worth making explicit that the caller must allocate the return buffer (even though it's sort of implied since the caller passes a pointer).
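To make the `sret` point concrete, here is a minimal C sketch of an indirect return written out by hand. The names `struct Triple`, `make_triple_lowered`, and `sum_triple` are hypothetical, and placing the return pointer as the leading parameter is an assumption for illustration, not something this thread confirms.

```c
#include <assert.h>

struct Triple { int a; int b; int c; };

/* Hand-lowered form of `struct Triple make_triple(int base)`:
   the result is written through a pointer supplied by the caller.
   (Assumption: the pointer is the leading parameter.) */
void make_triple_lowered(struct Triple *ret, int base) {
    ret->a = base;
    ret->b = base + 1;
    ret->c = base + 2;
}

/* Caller side: the return buffer lives in the caller's frame. */
int sum_triple(int base) {
    struct Triple result;                /* caller allocates the buffer */
    make_triple_lowered(&result, base);  /* callee fills it in */
    return result.a + result.b + result.c;
}
```

The callee never allocates anything here; it only writes into storage the caller already owns, which is the property the comment suggests spelling out.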
> For byval IIRC LLVM currently guarantees that the callee will not modify the buffer
Hmm, are you sure? When I compile code like:
struct A {
    int x;
    int y;
};
void foo(struct A a) {
    a.x = 42;
}
it produces:
foo: # @foo
    local.get 0
    i32.const 42
    i32.store 0
    end_function
which seems to modify the caller's buffer.
(Godbolt link: https://clang.godbolt.org/z/crKarv)
Oh you are right; the langref says that a hidden copy is created "between the caller and the callee" so that the caller's copy (the one represented in the IR) can't be modified but doesn't say how.
Our implementation creates the copy in the caller's frame, which is simple and makes sense.
So the machine ABI should probably specify that the caller may modify the copy that is passed to it, and therefore the callee must create a copy (or otherwise ensure that any modification doesn't affect the calling code, "as-if" there were a copy; maybe that part goes without saying?)
> that the caller may modify the copy that is passed to it, and therefore the callee must create a copy
I think you mean the other way around (callee...caller)?
Is this part of the ABI too? I thought it was just general semantics of the C language, rather than a machine-specific thing.
err... yes, the other way around. (callee may modify, so the caller creates a copy).
It is the semantics of the C language, but how it happens could vary between machines. It could just as easily be that the caller just passes a pointer to its own copy, and then the ABI requires that the callee not modify that copy (forcing it to make its own copy if necessary). Either way, the ABI ensures that the caller's copy is not modified.
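As a rough illustration of the caller-copies variant described in this thread, the following C sketch writes the lowering out by hand. `foo_lowered` and `original_x_after_call` are hypothetical names; the real copy is made by the compiler, not by user code.

```c
#include <assert.h>

struct A { int x; int y; };

/* Hand-lowered form of `void foo(struct A a) { a.x = 42; }`:
   the struct arrives as a pointer and the callee may write through it. */
void foo_lowered(struct A *a) {
    a->x = 42;
}

/* Caller side: copy first, then pass a pointer to the copy. */
int original_x_after_call(void) {
    struct A mine = {1, 2};
    struct A tmp = mine;   /* hidden copy in the caller's frame */
    foo_lowered(&tmp);     /* only the copy is modified */
    assert(tmp.x == 42);
    return mine.x;         /* the caller's own value is untouched */
}
```

Under the alternative convention mentioned above, the caller would pass `&mine` directly and `foo_lowered` would have to copy before writing.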
Interesting, makes sense.
I suspect multivalue ABI will be an additional ABI, rather than a replacement for the current one? (we'll still need this one for compatibility with older browsers / engines)
Multivalue is standardized and widely supported now, so I'm hoping multivalue will become the default, even if we also retain the pre-multivalue ABI for compatibility.
Right, that's my point - we'll still retain this ABI for compatibility. Given that it already has "As mentioned in README, it's not the only possible C ABI. It is the ABI that the clang/LLVM WebAssembly backend is currently using, [...]" in the disclaimer, that seems sufficient and we can reword once we create a doc for multivalue C ABI.
This section should probably also mention varargs. They work similarly (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp#L950) to byval: the caller assigns each argument an offset according to its alignment and size, creates a buffer in its stack frame (at compile time), and then at runtime copies each argument into its place and passes a pointer to the buffer (as the last argument).
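The layout step described here can be sketched in plain C. This is only an approximation under stated assumptions: `align_up`, `read_int_plus_double`, and `call_with_buffer` are hypothetical names, the host's `_Alignof` values stand in for the Wasm target's alignments, and the offsets are computed at run time rather than at compile time as the backend does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round offset up to the next multiple of align (align is a power of two). */
size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical callee: reads an int, then a double, from the buffer. */
double read_int_plus_double(const unsigned char *args) {
    size_t off = 0;
    int i;
    double d;
    off = align_up(off, _Alignof(int));
    memcpy(&i, args + off, sizeof i);
    off += sizeof i;
    off = align_up(off, _Alignof(double));
    memcpy(&d, args + off, sizeof d);
    return i + d;
}

/* Caller side: assign offsets, copy each argument in, pass the buffer. */
double call_with_buffer(int i, double d) {
    unsigned char buf[32];  /* buffer in the caller's frame */
    size_t off = 0;
    off = align_up(off, _Alignof(int));
    memcpy(buf + off, &i, sizeof i);
    off += sizeof i;
    off = align_up(off, _Alignof(double));
    assert(off + sizeof d <= sizeof buf);
    memcpy(buf + off, &d, sizeof d);
    return read_int_plus_double(buf);
}
```

Both sides walk the same offsets, which is why the layout resembles the fields of a struct built from the variadic arguments.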
@dschuff Good idea, yeah. So, if I understand correctly, varargs are laid out as if they were fields in an extra struct passed to the function?
How does that extra detail look, @RReverser?
LGTM. I, obviously, can't speak for its correctness / completeness, especially in light of me being confused by the
Co-authored-by: Ingvar Stepanyan <[email protected]>
@tlively Thanks for documenting this!
BasicCABI.md
Outdated
singleton struct or union[2] | direct | direct |
other struct or union | indirect | indirect |

[1] `long long double` is passed directly as two `i64` values.
Why was this changed to refer to `long long double` rather than `long double`? `long long double` is not defined in the table of primitive types.
I think this should have been just `long double`. `long double` in LLVM is a 128-bit FP value, and the math for it is software-emulated since there is no fp128 scalar type in wasm. So it would make sense to pass as a pair of i64s (although I haven't double-checked). AFAIK there is no `long long double`.
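For readers who want to picture the "pair of i64s" idea, here is a small C sketch that splits a 16-byte value into two 64-bit halves and joins them again. `fp128_bits`, `i64_pair`, `split_fp128`, and `join_fp128` are hypothetical names, and the order of the halves (first eight bytes, then last eight) is an assumption that this thread does not confirm.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 128 bits of a long double. */
typedef struct { unsigned char bytes[16]; } fp128_bits;

/* Hypothetical pair of 64-bit values, one per i64 parameter. */
typedef struct { uint64_t lo; uint64_t hi; } i64_pair;

/* Split the 16 bytes into two 64-bit halves (assumed order). */
i64_pair split_fp128(fp128_bits v) {
    i64_pair p;
    memcpy(&p.lo, v.bytes, 8);      /* first 8 bytes */
    memcpy(&p.hi, v.bytes + 8, 8);  /* last 8 bytes */
    return p;
}

/* Reassemble the original 16 bytes from the two halves. */
fp128_bits join_fp128(i64_pair p) {
    fp128_bits v;
    memcpy(v.bytes, &p.lo, 8);
    memcpy(v.bytes + 8, &p.hi, 8);
    return v;
}
```

The round trip is lossless regardless of which half comes first, so only the ordering needs to be fixed by the ABI text.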
Fixes #160.