Lifetime Analysis Improvements in Clang

Introduction

Use-after-free (UaF), use-after-return (UaR), and dangling pointers/references are common in C++ and often lead to crashes and security vulnerabilities. Clang already provides two mechanisms for detecting temporal memory safety violations: the [[clang::lifetimebound]] annotation and the [[gsl::Owner]]/[[gsl::Pointer]] annotations (a partial implementation of P1179).
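
For readers less familiar with the two mechanisms, here is a minimal sketch of how they are typically written. The names longer, Buffer and BufferView are hypothetical and only serve as illustration; the lines that Clang is expected to flag are left commented out so that the snippet stays free of undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// lifetimebound: the result may refer to either argument, so neither
// argument may be a temporary that dies before the result is used.
const std::string& longer(const std::string& a [[clang::lifetimebound]],
                          const std::string& b [[clang::lifetimebound]]) {
  return a.size() >= b.size() ? a : b;
}

// gsl::Owner / gsl::Pointer on hypothetical user-defined types:
// Buffer owns its characters, BufferView only points into a Buffer.
struct [[gsl::Owner]] Buffer {
  std::string data;
};

struct [[gsl::Pointer]] BufferView {
  BufferView(const Buffer& b) : ptr(b.data.data()), len(b.data.size()) {}
  const char* ptr;
  std::size_t len;
};

std::size_t safe_use() {
  std::string a = "short";
  std::string b = "much longer";
  const std::string& r = longer(a, b);  // OK: both arguments outlive r
  assert(&r == &b);

  Buffer buf{"abc"};
  BufferView view(buf);                 // OK: buf outlives view

  // Lines like these are the ones Clang is expected to warn about:
  //   const std::string& bad = longer(std::string("x"), a);
  //   BufferView dangling(Buffer{"tmp"});
  return r.size() + view.len;
}
```

Compilers that do not know these attributes ignore them, usually with a warning, so annotated code stays portable.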

In the past few months, we’ve been working on improving Clang’s lifetime analysis to increase the detection of UaFs and UaRs.

This thread shares our ongoing progress and plans in this area. We would like to hear feedback from the community as we continue improving these features.

(This work is a collaborative effort by @usx95, @ilya, @kinu, @higher-performance and myself. Special thanks to @Xazax-hun for their thorough code reviews and valuable feedback.)

Recent Progress

We’ve made a series of improvements to Clang’s lifetime analysis to enhance its ability to effectively detect and diagnose more lifetime-related bugs. Here’s what we’ve accomplished:

  • Detect Dangling References in Assignments: Two new diagnostics, -Wdangling-assignment and -Wdangling-assignment-gsl, have been implemented and enabled by default, helping to catch more dangling reference issues in assignments (#63310, #54492).
  • Align lifetimebound and GSL Analysis: We’ve fixed subtle issues caused by the interaction between the [[clang::lifetimebound]] and [[gsl::Owner/Pointer]] annotations, making their combined behavior more consistent and easier to reason about (#100549, #93386, #108272).
  • Beyond the specific points mentioned above, we’ve also made multiple enhancements, bug fixes, and cleanups (e.g. #100384, #106372, #100567, #81589).
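
As an illustration of the assignment case, here is a small sketch. The function names make_name and assign_safely are made up for this example, and the flagged line is commented out to keep the snippet well-defined; the exact diagnostic text may differ between Clang versions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

std::string make_name() { return "temporary"; }

std::size_t assign_safely() {
  std::string_view sv;

  // Expected to trigger -Wdangling-assignment-gsl: the temporary
  // std::string is destroyed at the end of the statement, leaving sv
  // pointing at freed memory.
  //   sv = make_name();

  std::string owner = make_name();  // keep the owner alive first
  sv = owner;                       // OK: owner outlives every use of sv
  assert(sv == "temporary");
  return sv.size();
}
```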

Ongoing Work

Better support for nested container case

Containers of pointers such as std::vector<std::string_view> and std::optional<std::string_view> are a common source of dangling references. The problem can occur during initialization, assignment or other setter operations. For example:

std::vector<std::string_view> v = {std::string()}; // dangling
v = {std::string()}; // dangling ref

std::optional<std::string_view> o = std::string(); // dangling
o = std::string(); // dangling ref

We are working on extending Clang’s analysis to detect dangling references in these cases (see the working PRs).

Extension for more general capture case via a new attribute [[clang::lifetime_capture_by(X)]]

The current [[clang::lifetimebound]] annotation is limited in scope:

  • it only expresses a lifetime relationship between a function argument and the return value (or this object).
  • it cannot express more general lifetime relationships, such as tying the lifetime of a function argument to the this object when the function is not a constructor.

To address this, we propose the [[clang::lifetime_capture_by(X)]] annotation, which allows developers to express more general lifetime relationships beyond the simple argument-to-return-value connection. This would enable Clang to detect UaFs in cases like the following:

std::string StrCat(std::string_view, std::string_view);
void test() {
   std::vector<std::string_view> v;
   v.push_back(StrCat("foo", "bar")); // now v holds a dangling string_view after the statement.
}

Please see the RFC for more details.
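
To give a rough idea of how such an annotation could look on a user-defined container, here is a sketch. It assumes the spelling from the proposal; ViewList, CAPTURED_BY and capture_safely are hypothetical names, and the macro expands to nothing on compilers that do not implement the attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper macro: use the proposed attribute when available.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::lifetime_capture_by)
#define CAPTURED_BY(X) [[clang::lifetime_capture_by(X)]]
#endif
#endif
#ifndef CAPTURED_BY
#define CAPTURED_BY(X)
#endif

class ViewList {
 public:
  // s is captured by *this: what s refers to must outlive the ViewList.
  void add(std::string_view s CAPTURED_BY(this)) { views_.push_back(s); }

  std::size_t total_length() const {
    std::size_t n = 0;
    for (std::string_view v : views_) n += v.size();
    return n;
  }

 private:
  std::vector<std::string_view> views_;
};

std::string StrCat(std::string_view a, std::string_view b) {
  std::string out(a);
  out += b;
  return out;
}

std::size_t capture_safely() {
  std::string joined = StrCat("foo", "bar");  // declared first, outlives list
  ViewList list;
  // The intended diagnostic would fire on a line like this one:
  //   list.add(StrCat("foo", "bar"));
  list.add(joined);
  list.add("baz");  // string literals have static storage duration
  assert(list.total_length() == 9);
  return list.total_length();
}
```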

Improve Documentation for [[clang::lifetimebound]] and GSL Annotations

While documentation exists for these annotations, it does not provide a high-level overview. In our experience, developers are easily confused by the overlap between [[clang::lifetimebound]] and the GSL Owner/Pointer annotations. We plan to consolidate all related content into a single user-facing document (e.g. “Clang’s Lifetime Analysis”) that gives a comprehensive overview and explains how to use these features effectively.

Possible future directions

We are still discussing what future improvements we could make. The possibilities below were brought up in those discussions. We have not committed to any of them, so there is no guarantee that we will work on a given item, and we welcome alternative suggestions.

Beyond Statement-Local Analysis

Currently, Clang’s analysis is limited to a single statement. One potential direction is to extend the analysis to be function-local, based on the control-flow graph (CFG), providing more comprehensive coverage of lifetime-related issues.

An alternative is performing a full dataflow analysis based on the dataflow framework, which could be implemented in ClangTidy. However, expanding the scope of the analysis directly in Clang remains valuable, especially given Clang’s existing support for CFG-based diagnostics.
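
To make the limitation concrete, here is a sketch of a bug that spans several statements. The helper first_word and the surrounding function are invented for illustration. Every statement is fine in isolation; the problem only appears when the scope of local is taken into account, which is why a statement-local analysis cannot see it. The dangling read is commented out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

std::string_view first_word(const std::string& s [[clang::lifetimebound]]) {
  return std::string_view(s).substr(0, s.find(' '));
}

std::size_t first_word_length() {
  std::string_view word;
  std::size_t length = 0;
  {
    std::string local = "hello world";
    word = first_word(local);  // fine on its own: local is a named object
    length = word.size();      // OK: local is still alive here
    assert(word == "hello");
  }
  // local is gone, so word now dangles. Reading through it would be a
  // use-after-scope that only a function-local (CFG-based) analysis
  // could report:
  //   char c = word[0];
  return length;
}
```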

Rust-style lifetime annotations

The Rust-style lifetime annotation (proposal) is a general-purpose approach. It can express more advanced lifetime contracts for arbitrary numbers of function parameters and the return value. However, implementing this right away in Clang would significantly increase the complexity, far beyond what is available in Clang today.

Improvements to thread-safety annotations

Although thread safety isn’t directly related to memory safety, it is another important area where we can improve overall program safety. The current analysis has some known limitations (e.g. #97550) that we could address to improve thread-safety detection.

Additional ideas are welcome.

Hey,

Thank you all for the recent improvements. I am a big fan of these changes, as they will find many bugs in many code bases with virtually no false positives.

We plan to consolidate all related content into a single user-facing document

I think this is a great plan! Clang already has top-level documentation for many of its safety features, including -fbounds-safety. I think it could be useful to also reference some of the related Clang Tidy and Clang Static Analyzer checks. The cplusplus.InnerPointer check is particularly relevant.

The CSA can also find some other cases of dangling by modeling allocations through new/delete, and it supports inter-procedural analysis. As a result, when the relevant methods of a user-defined type are properly analyzed, it can occasionally find temporal memory errors in modern C++ code.
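
For reference, this is roughly the kind of pattern cplusplus.InnerPointer is meant to catch. The sketch below is an illustration with a made-up function name: it keeps the invalid use commented out and shows the usual fix of re-fetching the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

std::size_t length_after_update() {
  std::string s = "llvm";
  const char* c = s.data();  // pointer into the internal buffer of s
  s = "clang";               // may reallocate: the old value of c is invalid

  // Using the stale pointer here is what the checker reports
  // (inner pointer used after re/deallocation):
  //   std::strlen(c);

  c = s.data();              // re-fetch the pointer after the modification
  assert(std::strcmp(c, "clang") == 0);
  return std::strlen(c);
}
```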

More documentation can also help to encourage other compilers to also respect these annotations.

Possible future directions

I think this section outlines some really good ideas, but I wanted to expand on the possibilities:

  • Currently, the cplusplus.InnerPointer checker does not consider the lifetime annotations. Adding support for that would be immensely beneficial for the users of CSA. One of the advantages of CSA is that it can find some errors that span across multiple functions. This is really useful for code that is not yet fully annotated.
  • Clang applies lifetimebound annotations to a select set of APIs from the standard library. This could be expanded. Moreover, the list is currently hard-coded in the compiler. This can get hard to maintain as soon as we want to annotate bigger API surfaces like ranges. Moving these out of the compiler and using APINotes might be a better approach. (The benefit of APINotes over annotating libcxx directly is that it should work with other standard library implementations, like the ones from Microsoft and GCC.)
  • Add some Clang-Tidy checks that could suggest inserting annotations in some simple scenarios where we do not need to do any dataflow analysis to know that a lifetimebound annotation is missing. This could help propagate these annotations through a project, using the annotations in the standard library as a seed for this process.
  • We could explore the options for integrating some of these lifetime annotations into the type system:
    • Could be part of the function type and we could warn on unsafe conversions
    • We could warn about overrides that are violating the substitution principle
  • Mixing multiple kinds of lifetime annotations can be confusing. We already have many in Clang, like gsl::Pointer and lifetimebound, and some in proposals, like the Rust-style annotations. It is not impossible that we might end up having even more in the future. I think it would be nice if users could convert between the different representations when possible, e.g., having a tidy check that takes some code with Owner/Pointer annotations and replaces those with semantically equivalent lifetimebound annotations. Moreover, whenever we have a clear mapping between some annotations, we could potentially simplify the implementation in Clang by doing some desugaring first and handling fewer cases in the lifetime analysis.

One potential direction is to extend the analysis to be function-local, based on the control-flow graph (CFG)

One question is whether using ClangIR would provide any benefits over CFG for this analysis. That being said, ClangIR is unlikely to be production ready at this point and we should not block any improvements to Clang on that.

@bcardosolopes

Hi Gábor,

Thank you for the thoughtful replies.

I think it could be useful to also reference some of the related Clang Tidy and Clang Static Analyzer checks.

Definitely. Centralizing all related tools in the documentation will be especially useful for users who may not yet be familiar with the full range of capabilities that Clang offers.

I think this section outlines some really good ideas, but I wanted to expand on the possibilities:

I appreciate your suggestions. These are great ideas.

Currently, the cplusplus.InnerPointer checker does not consider the lifetime annotations.

+1, agree that enhancing CSA to support lifetime annotations is beneficial, particularly for catching more use-after-free issues.

We’re not heavy users of CSA. In our past experience, while path-sensitive checks are useful, CSA sometimes struggled with understanding C++, and the false positive rate was high in our internal code base (though I’m not sure about the current status). Still, I think this is certainly an area worth exploring further.

Add some Clang-Tidy checks that could suggest inserting annotations in simple scenarios

Yes, automatically inferring [[lifetimebound]] to encourage more annotated code is definitely a direction worth pursuing (this could even be implemented as a Clang warning rather than a clang-tidy check).

A simple example would be getter methods:

struct Foo {
   std::string name;
   const std::string& getName() const { return name; } // Infer lifetimebound.
};

A method that returns a reference to an owner member should be marked as lifetimebound to the this object.
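
For illustration, this is roughly what the getter would look like once the annotation has been inferred or written by hand, together with the kind of call site it is expected to expose. makeFoo and name_length are hypothetical helpers, and the flagged line is commented out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Foo {
  std::string name;
  // The attribute after the function type applies to the implicit this
  // parameter: the returned reference must not outlive the Foo object.
  const std::string& getName() const [[clang::lifetimebound]] { return name; }
};

Foo makeFoo() { return Foo{"clang"}; }

std::size_t name_length() {
  // With the annotation, Clang is expected to warn here, because the
  // temporary Foo is destroyed at the end of the full-expression:
  //   const std::string& bad = makeFoo().getName();

  Foo foo = makeFoo();                      // keep the object alive
  const std::string& good = foo.getName();  // OK: foo outlives good
  assert(&good == &foo.name);
  return good.size();
}
```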

One question is whether using ClangIR would provide any benefits over CFG for this analysis.

Thanks for bringing up ClangIR – I haven’t focused much on it yet, but it’s an interesting option, especially given that it implements P1179 for lifetime checks.

Cool work and status update, thanks for sharing!

+1, my take is that this is overall goodness and any improvement made here should translate into benefits later for ClangIR passes. Even though CIR could provide benefits over CFG, it’s not mature enough to block any upstream work.

It’d also be cool to support the lifetimebound stuff!