Document function signatures in C ABI #161

Merged: 5 commits, Jan 20, 2021

Conversation

Member

@tlively tlively commented Jan 13, 2021

Fixes #160.

@tlively tlively requested review from RReverser and dschuff January 13, 2021 20:08
BasicCABI.md Outdated
### Function signatures

Scalar types passed as arguments are passed via WebAssembly function parameters and returned via WebAssembly function
results of their corresponding Wasm value types. `long double`, which has no corresponding Wasm value type, is passed
Member

General nitpick: I'd format this as a list / somehow split up into lines for easier readability.

@tlively
Member Author

tlively commented Jan 13, 2021

@RReverser I made it a table, ptal.

@sunfishcode
Member

Would it make sense to mention the upcoming multivalue ABI here, so that people building tools that know about C ABIs are aware that this change is coming?

BasicCABI.md Outdated

Types can be passed directly via WebAssembly function parameters or indirectly
via a pointer parameter that points to the value in memory. Similarly, types can
either be returned directly from WebAssembly functions or returned indirectly
Member

For byval IIRC LLVM currently guarantees that the callee will not modify the buffer. Do we want to make a similar guarantee in the ABI? For sret, it's probably worth making explicit that the caller must allocate the return buffer (even though it's sort of implied since the caller passes a pointer).
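The point about the caller allocating the return buffer can be sketched in C by writing out the lowering by hand. This is only an illustration under assumptions: `struct Pair`, `make_pair_lowered`, and `sum_of_pair` are hypothetical names that do not appear in BasicCABI.md, and placing the result pointer first in the parameter list is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-field struct; under the rules discussed here a
 * non-singleton struct is returned indirectly. */
struct Pair {
    int first;
    int second;
};

/* Source-level view:  struct Pair make_pair(int a, int b);
 * Lowered view (sketch): the result becomes a pointer parameter that
 * the caller supplies. Its position in the list is an assumption. */
void make_pair_lowered(struct Pair *ret, int a, int b) {
    assert(ret != NULL); /* the caller must have allocated the buffer */
    ret->first = a;
    ret->second = b;
}

/* Caller side: reserve space in its own frame, then pass the address. */
int sum_of_pair(int a, int b) {
    struct Pair result; /* return buffer lives in the caller's frame */
    make_pair_lowered(&result, a, b);
    return result.first + result.second;
}
```

The callee never allocates anything here; it only writes through the pointer it was given, which is why the allocation duty is worth stating explicitly in the ABI text.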

Member

> For byval IIRC LLVM currently guarantees that the callee will not modify the buffer

Hmm, are you sure? When I compile code like:

struct A {
    int x;
    int y;
};

void foo(struct A a) {
    a.x = 42;
}

it produces:

foo:                                    # @foo
        local.get       0
        i32.const       42
        i32.store       0
        end_function

which seems to modify the caller's buffer.

(Godbolt link: https://clang.godbolt.org/z/crKarv)
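To make the observation concrete, here is a hand-written C sketch of the same situation. `foo_lowered` mirrors the generated code above (a store through the incoming pointer), and `x_after_call` is a hypothetical caller that copies the struct into a temporary before the call. Both names are illustrative assumptions, not compiler output.

```c
#include <assert.h>
#include <stddef.h>

struct A {
    int x;
    int y;
};

/* Sketch of the lowered callee: the struct arrives as a pointer and the
 * write goes straight through it, like the i32.store shown above. */
void foo_lowered(struct A *a) {
    assert(a != NULL);
    a->x = 42;
}

/* Hypothetical caller: it passes a temporary copy, so the callee's
 * write lands in the copy and the original object keeps its value. */
int x_after_call(const struct A *original) {
    struct A tmp = *original; /* copy made in the caller's frame */
    foo_lowered(&tmp);
    return original->x;       /* unchanged by the call */
}
```

So a store through the pointer is only safe if the pointer already refers to a copy rather than to the object the caller still uses.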

Member

@dschuff dschuff Jan 14, 2021

Oh you are right; the langref says that a hidden copy is created "between the caller and the callee" so that the caller's copy (the one represented in the IR) can't be modified but doesn't say how.
Our implementation creates the copy in the caller's frame, which is simple and makes sense.
So the machine ABI should probably specify that the caller may modify the copy that is passed to it, and therefore the callee must create a copy (or otherwise ensure that any modification doesn't affect the calling code, "as-if" there were a copy; maybe that part goes without saying?)

Member

> that the caller may modify the copy that is passed to it, and therefore the callee must create a copy

I think you mean the other way around (callee...caller)?

Is this part of the ABI too? I thought it was just general semantics of the C language, rather than a machine-specific thing.

Member

err... yes, the other way around. (callee may modify, so the caller creates a copy).

It is the semantics of the C language, but how it happens could vary between machines. It could just as easily be that the caller passes a pointer to its own copy, and the ABI then requires that the callee not modify that copy (forcing it to make its own copy if necessary). Either way, the ABI ensures that the caller's copy is not modified.
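The alternative convention mentioned here, where the callee is the one that must not modify the buffer, could look roughly like this in hand-lowered C. `foo_callee_copies` is a hypothetical name, and this is a sketch of the alternative only, not of what the thread says LLVM does.

```c
#include <assert.h>
#include <stddef.h>

struct A {
    int x;
    int y;
};

/* Alternative convention (sketch): the caller passes a pointer to its
 * own object and the callee promises not to write through it. */
int foo_callee_copies(const struct A *a) {
    assert(a != NULL);
    struct A local = *a; /* private copy made by the callee */
    local.x = 42;        /* modification stays inside the callee */
    return local.x + local.y;
}
```

Either convention preserves C's pass-by-value semantics; the ABI only has to pick which side pays for the copy.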

Member

Interesting, makes sense.

@RReverser
Member

> the upcoming multivalue ABI here, so that people building tools that know about C ABIs are aware that this change is coming?

I suspect multivalue ABI will be an additional ABI, rather than a replacement for the current one? (we'll still need this one for compatibility with older browsers / engines)

@sunfishcode
Member

Multivalue is standardized and widely supported now, so I'm hoping multivalue will become the default, even if we also retain the pre-multivalue ABI for compatibility.

@RReverser
Member

> Multivalue is standardized and widely supported now, so I'm hoping multivalue will become the default, even if we also retain the pre-multivalue ABI for compatibility.

Right, that's my point - we'll still retain this ABI for compatibility.

Given that it already has "As mentioned in README, it's not the only possible C ABI. It is the ABI that the clang/LLVM WebAssembly backend is currently using, [...]" in the disclaimer, that seems sufficient and we can reword once we create a doc for multivalue C ABI.

@dschuff
Member

dschuff commented Jan 14, 2021

This section should probably also mention varargs. They work similarly (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp#L950) to byval: the caller assigns each argument an offset according to its alignment and size, creates a buffer in its stack frame (at compile time), and then at runtime copies each argument into its place and passes a pointer to the buffer (as the last argument).
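The offset assignment described above can be sketched as a small layout routine. This is a minimal illustration, assuming each variadic argument is described by a size and a power-of-two alignment; `struct ArgInfo`, `align_up`, and `layout_varargs` are hypothetical names and are not taken from the LLVM source.

```c
#include <assert.h>
#include <stddef.h>

/* Size and alignment, in bytes, of one variadic argument. */
struct ArgInfo {
    size_t size;
    size_t align; /* assumed to be a power of two */
};

/* Round `offset` up to the next multiple of `align`. */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Assign each argument an offset within the buffer, in order, and
 * return the number of bytes the buffer needs. */
size_t layout_varargs(const struct ArgInfo *args, size_t count,
                      size_t *offsets) {
    size_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        offset = align_up(offset, args[i].align);
        offsets[i] = offset;
        offset += args[i].size;
    }
    return offset;
}
```

For three arguments with (size, alignment) of (4, 4), (8, 8), and (4, 4), this assigns offsets 0, 8, and 16 and needs 20 bytes. Whether the real buffer is rounded up further, for example to the stack alignment, is not covered by this sketch.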

@RReverser
Member

@dschuff Good idea, yeah. So, if I understand correctly, varargs are laid out as if they were fields in an extra struct passed to the function?

@tlively
Member Author

tlively commented Jan 20, 2021

How does that extra detail look, @RReverser?

@RReverser
Member

> How does that extra detail look, @RReverser?

LGTM. I, obviously, can't speak for its correctness / completeness, especially in light of me being confused by the 3 * sizeof(varargs) allocation that Clang seems to be doing, but if you think this description is sufficiently correct, I'm happy to go with it.

Co-authored-by: Ingvar Stepanyan <[email protected]>
@tlively tlively merged commit bc9f8af into master Jan 20, 2021
@RReverser RReverser deleted the tlively-patch-1 branch January 20, 2021 03:19
@RReverser
Member

@tlively Thanks for documenting this!

BasicCABI.md Outdated
singleton struct or union[2] | direct | direct |
other struct or union | indirect | indirect |

[1] `long long double` is passed directly as two `i64` values.

Why was this changed to refer to long long double rather than long double? long long double is not defined in the table of primitive types.

Member

I think this should have been just long double. long double in LLVM is a 128-bit FP value, and the math for it is software-emulated since there is no fp128 scalar type in wasm. So it would make sense to pass as a pair of i64s (although I haven't double-checked). AFAIK there is no long long double.
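As a rough illustration of what "a pair of i64s" could mean, the sketch below splits the 16 bytes of a 128-bit value into two 64-bit halves. It assumes a little-endian byte layout with the low half first; that ordering is an assumption of this example and is not confirmed in this thread. `struct I64Pair`, `load_le64`, and `split_128_bits` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two 64-bit halves of a 128-bit value. */
struct I64Pair {
    uint64_t lo;
    uint64_t hi;
};

/* Read 8 bytes as a little-endian 64-bit integer, independent of the
 * byte order of the machine running this code. */
static uint64_t load_le64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Split 16 raw bytes into (lo, hi). Putting the low half first is an
 * assumption of this sketch. */
struct I64Pair split_128_bits(const unsigned char bytes[16]) {
    assert(bytes != NULL);
    struct I64Pair pair;
    pair.lo = load_le64(bytes);
    pair.hi = load_le64(bytes + 8);
    return pair;
}
```

The sketch only moves bits around; it says nothing about how the software-emulated arithmetic on the value works.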

Successfully merging this pull request may close these issues.

Document rules for complex params passed to / returned from functions in C ABI
5 participants