[RFC] Yet Another LLVM restrict Support

Abstract

LLVM’s current support for the C restrict qualifier is very limited and incomplete. In fact, LLVM only handles restrict on function arguments (as a noalias attribute). All other uses – such as restrict on local or global variables, struct/union members, or in block scope – are completely ignored. This long-standing gap has been noted in the community (e.g. Jeroen Dobbelaere’s 2019 RFC) and discussed in LLVM dev meetings. For example, a 2017 LLVM developer talk observed that C99’s definition of “based on” for restrict pointers is “completely general”, making the compiler’s task “difficult in mostly non-useful ways”. In short, naïve mapping of restrict to LLVM metadata has proven insufficient, and previous patch attempts (e.g. Hal Finkel’s “local restrict” patches) are stalled, partly because they rely on new intrinsics that inhibit optimizations. We therefore propose a new approach to fully support restrict in LLVM without heavy intrinsics, by encoding lexical scope information into TBAA metadata.

This design addresses two main challenges in LLVM: first, the IR that Clang emits doesn’t normally record where in the source each pointer was declared; second, the C standard’s “based on” definition is so general that tracking it exactly in the compiler is very difficult. By encoding scope information directly into metadata, we sidestep expensive interprocedural “based on” tracking. (As a slide from the 2017 LLVM Dev Meeting talk put it, pointer capturing across calls is “the root of all evil” in restrict analysis, because it can make alias decisions undecidable.) Our scope-coding approach is tractable because it only needs local, lexical structure. In the rest of this proposal we explain the C rules for restrict, illustrate the idea, and outline the implementation.

Some insights about restrict behavior

The C standard’s restrict rules can be summarized roughly as follows:

  • Let X be an lvalue, let P be a restrict-qualified pointer initialized to point to X, and let D be the block in which P was declared. If X is modified within D, then any access to X within D must be performed exclusively through the pointer P or through expressions based on P.

In plain terms: If an object X is modified in the same block where P was declared, then all further accesses to X in that block must be through P (or something based on P). If the object is never modified in the block, there is no aliasing restriction. We illustrate this with the following examples:

{
    int x;
    x = 42;                 // modification of x in the outer block
    int y;
    {                       // inner block
        int *restrict px = &x;
        y = *px + x;        // valid: x was modified in the outer block
    }
}

Contrast with:

{
    int x;
    x = 42;                 // modification of x in this block
    int y;
    int *restrict px = &x;
    y = *px + x;           // invalid: access to x not exclusively through px
}

With these rules in mind, LLVM faces two obstacles:

  • Lack of scope info in IR. Clang does not emit any IR metadata to tell LLVM which block a local or global pointer was declared in (unlike function-argument attrs). Without knowing the original scope, LLVM can’t enforce the “within the block” restriction directly.
  • Tracking “based on” is hard. C’s definition of what counts as an expression “based on” P is notoriously tricky (as discussed in the 2017 slide).

To begin with, let’s consider the first point.

TBAA to the rescue

Our core idea is to leverage LLVM’s existing Type-Based Alias Analysis (TBAA) mechanism. We treat each restrict pointer as having a unique type-domain ID, so that it does not alias with other types – but in a scope-sensitive way. Concretely, we will generate a TBAA tag (metadata) for each pointer (of type T* restrict). Naïvely, one might think “just give each restrict pointer a fresh TBAA ID (hence noalias with all others)”. That doesn’t work because of C’s nuances. Consider this code:

int *foo();

void bar() {
    int *p = foo();
    *p = 1;           // First store
    { // new block
        int *restrict rp = foo();
        *rp = 2;      // Second store
    }
    *p = 3;           // Third store
}

This code is valid, but the stores clearly alias (foo() presumably returns the same address). The compiler cannot reorder these stores arbitrarily. In particular, it must not treat *rp = 2 as independent of *p = 1 and *p = 3, even though rp is restrict. The naive approach of tagging rp with a new TBAA ID that aliases with nothing would forbid any interaction between *p and *rp, disallowing the valid memory dependency above.

Scope encoding

The scopes of a C program form an N-ary tree. First of all, note that any tree can be represented as a binary tree in LCRS (left-child right-sibling) form, so from here on we consider only binary trees.
To encode “parent-child” relationships between scopes compactly, we use a Huffman-style path code: the root is encoded as “1”; a right child appends “0” to its parent’s code, and a left child appends “1”.

If there is at least one restrict pointer of type T in the scope body, we must also encode other pointers in inner scopes.

This representation makes it straightforward to determine whether pointers originate from:

  • Parent-child scopes, or

  • Sibling scopes
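As a rough illustration of the encoding, here is a minimal C sketch. It assumes the usual LCRS convention (left child = first nested scope, right child = next scope in the same enclosing block) and keeps codes as strings of '0'/'1'. All helper names are hypothetical, and the nesting test is only one possible reading of the scheme, not a settled design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Code of the first scope nested in `parent` (LCRS left child): append '1'. */
static void first_child(const char *parent, char *out, size_t cap) {
    assert(strlen(parent) + 2 <= cap);   /* room for one more bit and NUL */
    snprintf(out, cap, "%s1", parent);
}

/* Code of the scope that follows `prev` inside the same enclosing scope
 * (LCRS right child): append '0'. */
static void next_sibling(const char *prev, char *out, size_t cap) {
    assert(strlen(prev) + 2 <= cap);
    snprintf(out, cap, "%s0", prev);
}

/* True if `inner` is nested, directly or transitively, in `outer`.
 * Assumed rule: in LCRS the descendants of a scope are exactly the nodes
 * of its left subtree, so their codes start with code(outer) plus '1'. */
static bool scope_encloses(const char *outer, const char *inner) {
    size_t n = strlen(outer);
    return strncmp(outer, inner, n) == 0 && inner[n] == '1';
}

/* True if the two scopes are distinct and neither encloses the other. */
static bool scopes_disjoint(const char *a, const char *b) {
    return strcmp(a, b) != 0 && !scope_encloses(a, b) && !scope_encloses(b, a);
}
```

With a function body encoded as "1", its first inner block gets "11" and the block after that gets "110". Both are enclosed by "1", while "11" and "110" are disjoint from each other.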

The next step is to mark every restrict pointer as restrict and give it a unique ID.

Provenance (pointer copies)

One subtle point is pointer assignment and “provenance.” Consider:

int *restrict x;
int *y = x;     // Is y now 'based on' x?

According to the strictest reading of C99 examples, one might worry: does copying a restrict pointer into another imply that y is “based on” x and thus must itself be used restrictively? In practice, almost all guidance and compiler implementations say no: an unrestricted pointer y that is merely assigned from a restrict pointer x does not inherit the alias restriction. TI’s documentation explicitly states that “A pointer expression based on a copy of P is not based on P, since the copy is not affected by a change to P”, and that code like

void f(short *restrict p1) {
    short *p2 = p1;
    *p1 = ...;   // and later *p2 = ...
}

is undefined.

It seems to me they are right. I have also written a corresponding proposal for the C language standard, which is currently awaiting review.

Conclusion

I believe it is finally possible to support restrict in LLVM. We plan to prototype this patch soon and welcome feedback and design review from the LLVM community. If you see any potential issues or corner cases, or have alternative suggestions, please don’t hesitate to share your thoughts. Your insights will help us refine this design and move toward a complete and practical implementation of restrict support in LLVM.

We look forward to your comments and suggestions.

References

  1. LLVM dev post: LLVM mostly ignores the ‘restrict’ qualifier
  2. Jeroen Dobbelaere’s 2019 RFC: Full restrict support in LLVM
  3. LLVM Dev Meeting 2017 – Restrict-Qualified Pointers in LLVM (PDF)
  4. C2y Working Draft (WG14 N3220), ISO/IEC JTC1 SC22 WG14
  5. TI whitepaper: Performance Tuning with the RESTRICT Keyword
  6. WG14 Defect Report N3659 – Considering expressions based on restrict pointers as pure rvalue expressions

CC @dobbelaj @nikic for awareness and opinions

With regard to these examples, I don’t see any ambiguity in the definition of “based on” from the standard (section “Formal definition of restrict”):

In what follows, a pointer expression E is said to be based on object P if (at some
sequence point in the execution of B prior to the evaluation of E) modifying P to point to
a copy of the array object into which it formerly pointed would change the value of E.
Note that ‘‘based’’ is defined only for expressions with pointer types.

Modifying P (x) before the initial assignment of variable y will change the value of a pointer expression using y, so yes, it is based on x.

I would also argue that LLVM already interprets such a pointer as based on the restrict pointer, since it performs optimizations using this information: note how it does not reload *q at the end of function test2 (Compiler Explorer).
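To make the point concrete without the link, here is a small hypothetical sketch of the pattern (it is not the code behind the Compiler Explorer link, and the function name is made up). Under the standard's definition q is based on p, so a compiler that honors restrict may assume the store through other cannot change *q.

```c
#include <assert.h>

/* Hypothetical illustration of a copy that is "based on" a restrict pointer. */
int store_then_read(int *restrict p, int *other) {
    assert(p != other);  /* caller contract: other must not alias *p */
    int *q = p;          /* plain int*, but its value depends on p */
    *q = 1;
    *other = 2;          /* assumed not to touch *q, because q is based on p */
    return *q;           /* may be folded to 1 without reloading */
}
```

If q were not treated as based on p, the compiler would have to reload *q after the store through other.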

Hi @vbe-sc ,

I do agree with @brunodf about the ‘based on’ interpretation. IMHO, the standard is pretty clear in that respect.

Regarding Bruno’s godbolt example, gcc also follows this ‘based on’ interpretation.

Note, Bruno’s quote about ‘based on’ can be found in these locations:

  • C99 (or at least almost?): n2310.pdf, 6.7.3.1 Formal definition of restrict, paragraph 3
  • C11: n1548.pdf, 6.7.3.1 Formal definition of restrict, paragraph 3
  • @vbe-sc’s reference 4: n3220.pdf, 6.7.4.2 Formal definition of restrict, paragraph 3

Wrt the Full restrict patches [2]: it indeed is too bad that a version of these patches is not yet part of main llvm. I am aware of a number of products/companies that have integrated the ‘full restrict patches’ [2] in their own compiler (like [0]).

Let me know if you can help out with that effort, so we can have such support in the main llvm without the need to change the meaning of ‘restrict’. We can also discuss this in the LLVM AA Tech Calls [3]. The next one is on August 11.

Greetings,

Jeroen Dobbelaere

[0] https://www.synopsys.com/designware-ip/processor-solutions/asips-tools/asip-newsletters/asip-eupdate-april-2020.html

[1] My presentation about Full Restrict (LLVM Dev Meeting 2021)

[2] Full Restrict Support; my latest public update of the patches

[3] LLVM AA Tech Calls

You’ll want to be careful not to stuff those IDs into a 64-bit integer or similar, since code nesting can easily make them much longer than 64 bits.

The general idea is interesting, but the proposal to the standard needs many more detailed examples of useful and not useful, valid and invalid/undefined behavior.

Currently it seems that the new restrict is inconvenient to use and introduces new undefined behavior.

Is this valid code or undefined behavior? Does restrict help here if it’s valid?

void copy(void *restrict pv1, void *restrict pv2) {
    int *p1 = pv1;
    int *p2 = pv2;
    *p1 = *p2; // or a loop
}

How about an approach that doesn’t propagate “based on” through function calls but does full propagation otherwise? Or one where function result and results through pointer arguments are always considered “based on” arguments and global variables?

If tracking the “based on” relationship is considered too hard, the standard does already suggest that an implementation could limit itself to an analysis based on the declarations of pointer variables alone. This would better fit with an implementation based on TBAA, but it would not exploit the full implications of restrict. Given two pointer declarations int * restrict p and int *q with overlapping scope it would conservatively have to assume that memory access using p could alias memory access using q. (As said above, I think such a weaker implementation would deviate from the way LLVM is already handling restrict today, but at least this would still be compatible with the standard.)

Concretely, in N3220, I’m referring to 6.7.4.2, examples 5-7, where the description of example 7 reads (emphasis mine):

Here the translator can make the no-aliasing inference only by analyzing the body of the function and proving that q cannot become based on p. Some translator designs may choose to exclude this analysis, given availability of the more effective alternatives described previously.

These examples and this guidance were added by N2260, which further explains the reasoning.

Thank you, you’re absolutely right. I’ve been considering this issue and am currently exploring the idea of imposing a reasonable limit on scope nesting depth (e.g., 32 levels), beyond which the optimization would not be applied. Another option could be using APInt for arbitrary-precision arithmetic. Either way, this is more of a technical implementation detail.
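For the fixed-width option, a minimal sketch could look like the following. The type and function names are hypothetical, and the cap is applied to the code length in bits, which is an assumption: with the LCRS encoding a code grows with the number of preceding sibling scopes as well as with nesting depth, so a limit stated in nesting levels would not bound it by itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CODE_BITS 32u  /* assumed cap, echoing the "32 levels" idea */

typedef struct {
    uint64_t bits;  /* scope code; the leading 1 is the root */
    unsigned len;   /* number of bits in use */
    bool valid;     /* false once the cap has been exceeded */
} ScopeCode;

static ScopeCode scope_root(void) {
    ScopeCode c = { 1u, 1u, true };
    return c;
}

/* Append one bit: 1 for a left (first nested) child, 0 for a right
 * (next sibling) child. Past the cap the code is marked invalid, and a
 * client would simply skip the restrict-based optimization. */
static ScopeCode scope_extend(ScopeCode parent, unsigned bit) {
    assert(bit <= 1u);
    ScopeCode c = parent;
    if (!parent.valid || parent.len >= MAX_CODE_BITS) {
        c.valid = false;
        return c;
    }
    c.bits = (parent.bits << 1) | bit;
    c.len = parent.len + 1u;
    return c;
}
```

An invalid code stays invalid when extended further, so every scope below the cut-off falls back to the conservative answer.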

Thank you for this observation. Indeed, according to the latest version of the standard, the code

void copy(void *restrict pv1, void *restrict pv2) {
    int *p1 = pv1;
    int *p2 = pv2;
    *p1 = *p2; // or a loop
}

you provided is valid (assuming pv1 and pv2 do not point to the same lvalue, of course). However, from my perspective, the fact that such code is valid – and furthermore, that it implies restrict-specific optimizations under detailed compiler analysis – is more of a language defect and should instead be classified as undefined behavior. Below is a detailed explanation from my proposal justifying this stance:

The contradiction is as follows: the restrict keyword is a type qualifier for pointer types, meaning it gives a corresponding pointer certain additional properties, as described in Section 6.7.4.2 of the standard. However, when we create a copy of a restrict-qualified pointer into a non-restrict pointer – as shown in Example 4 – this copy becomes based on the original and effectively inherits the same properties, even though it is not itself qualified with restrict.

Here’s where my proposed changes to the standard explicitly mark this behavior as undefined:

Here, the translator can make the no-aliasing inference because, if q becomes
based on p and there are both a write and a read accesses through q and p,
respectively, the behavior is undefined.

And this is (unfortunately) true. But from my perspective, this added complexity for compiler analysis seems poorly justified. While N2260 provides some guidelines for programmers when using restrict-qualified function parameters (like adding const), the situation appears more problematic for local restrict pointers and restrict-qualified structure fields. In these cases, function bodies might require unpleasant boilerplate code.

Programmers use restrict for a reason, and I believe (though I might be overly categorical here) that we could standardize more liberal compiler behavior for such cases, giving implementations greater freedom to optimize while reducing unexpected constraints on programmers.

I’m strongly against such a broad expansion of undefined behavior. It basically says that restrict pointers are forbidden from being assigned anywhere else (and forbidden in the worst form: as undefined behavior). And I really doubt that the standards committee would approve turning a significant amount of currently valid code into undefined behavior.

A local change for a properly explained hard case (function calls) is far more likely to be approved than a change of general meaning that would force not just a code review but almost certainly a code rewrite for anything using restrict.

We discussed this briefly in the clang area team meeting.

Our understanding is that according to this proposal, existing code which assigns a restrict pointer to a non-restrict variable or parameter has undefined behavior. To the extent you’re proposing that, that doesn’t seem like something we can accept: the potential breakage is way too widespread. The standard definition, all existing implementations, and a lot of existing code expect that capturing is allowed.

That said, we’re generally interested in improvements to restrict, or restrict-like capabilities.