AST Structure Strings

Journeyman1337 · November 12, 2024, 4:33pm

I have two related questions. I am trying to create an AST node for my custom language frontend. My language is homoiconic, and the AST is a bit like a Lisp structure. I want to store strings in the AST so that I can access them multiple times throughout the compilation process, and I am wondering whether to use llvm::StringRef or std::string to store them. I am reading my source files using the llvm::MemoryBuffer API. Should I store my strings as llvm::StringRef pointing directly into the llvm::MemoryBuffer, should I allocate new memory with an std::string, or is there something else I should do?

Another thing is that I am trying to make the AST nodes have a small memory footprint. I am using a union in the type definition to save on space. I know that if I store an std::string in a union this would complicate destruction of AST nodes. If I store an llvm::StringRef in a union, I don’t think this will cause issues on destruction. Am I wrong about this?

AaronBallman · November 13, 2024, 2:10pm

It depends on the lifetime of the data: StringRef doesn’t own its backing buffer, so if the data needs to outlive the memory buffer, then I’d recommend using std::string and accepting the overhead of the copy. If the data lives as long as the memory buffer, then a StringRef is a reasonable option to avoid the copies.
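To make the lifetime trade-off concrete, here is a minimal sketch. It is not LLVM code: std::string_view stands in for llvm::StringRef (both are non-owning pointer-plus-length views), and SourceBuffer, borrowToken, and copyToken are hypothetical names invented for illustration, with SourceBuffer playing the role of the llvm::MemoryBuffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the source text held by llvm::MemoryBuffer.
struct SourceBuffer {
    std::string text;
};

// Non-owning view into the buffer: no allocation and no copy, but the
// result is only valid while `buf` is alive and its text is unmodified.
inline std::string_view borrowToken(const SourceBuffer &buf,
                                    std::size_t pos, std::size_t len) {
    return std::string_view(buf.text).substr(pos, len);
}

// Owning copy: costs an allocation and a copy, but the result stays
// valid after the buffer is released or changed.
inline std::string copyToken(const SourceBuffer &buf,
                             std::size_t pos, std::size_t len) {
    return buf.text.substr(pos, len);
}
```

If the AST is always torn down before the buffer, the borrowing variant is enough; if nodes may outlive the buffer, the copying variant is the safe choice.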

I think you’re fine to use StringRef in a union without complicating destruction because StringRef has a trivial destructor.
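As a rough illustration of the union point, the sketch below uses std::string_view as a stand-in for llvm::StringRef (an assumption made so the example compiles without LLVM; requires C++17). AstNode, NodeKind, and the factory functions are hypothetical names, not anything from the original post.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <type_traits>

// std::string_view plays the role of llvm::StringRef here (assumption).
// Common standard library implementations make it trivially destructible.
static_assert(std::is_trivially_destructible_v<std::string_view>);

enum class NodeKind : std::uint8_t { Integer, Symbol };

// Hypothetical tagged-union AST node.
struct AstNode {
    NodeKind kind;
    union {
        std::int64_t integer;
        std::string_view symbol; // borrowed from the source buffer
    };

    static AstNode makeInteger(std::int64_t v) { return AstNode(v); }
    static AstNode makeSymbol(std::string_view s) { return AstNode(s); }

private:
    // The view type has a non-trivial default constructor, so the union
    // member must be initialized explicitly by each constructor.
    explicit AstNode(std::int64_t v) : kind(NodeKind::Integer), integer(v) {}
    explicit AstNode(std::string_view s) : kind(NodeKind::Symbol), symbol(s) {}
};

// Every union member is trivially destructible, so the node needs no
// user-written destructor and no switch on `kind` at destruction time.
static_assert(std::is_trivially_destructible_v<AstNode>);
```

Replacing the view member with std::string would make the second static_assert fail, and the node would then need a hand-written destructor that checks kind before destroying the string.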
