[Rust] Add better support for "crate-per-schema" #8273

adsnaider · 2024-03-31T00:39:25Z

Currently, flatc allows using --rust-module-root-file and --gen-all to generate multiple schemas into a single crate with a top-level mod.rs. This works, but it is hard to use in many contexts, since the best (only?) option for inter-dependent schemas is to generate everything together into a single crate.

Ideally, we can generate each schema independently of the includes (as each include will be its own generated crate), and link them all at build time.

adsnaider · 2024-03-31T00:49:58Z

I don't particularly care about doing this through flatc directly. We use Bazel downstream and I have some patches to implement this, but we're somewhat behind master so it might take me a bit of time to send a PR. It is on my radar though.

The big thing that needs to change is that the use statements in each module need to be more specific. For instance:

// foo.fbs
namespace foo;
// ... some definitions

// foobar.fbs
include "foo.fbs"
namespace foo.bar;
// ... some definitions

The generated code should be like

// foobar_generated.rs

use foo_generated::*;

pub mod foo {
  use foo_generated::foo::*; // As opposed to the current `use foo_generated::*;`

  pub mod bar {
    // No need to include foo_generated here since it doesn't live in `foo.bar`
  }
}

I think the correct approach is to include [some_generated_dep]::[some_module_tree]::* if and only if [some_module_tree] is a prefix of [some_generated_dep]'s namespace. I'm hoping someone will tell me if that doesn't work in every case :)
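The rule above can be sketched as a small function that decides, for each module being emitted, whether a dependency gets a glob import. This is only an illustration of the proposal: the `Dep` type and `glob_import` function are hypothetical names, not part of flatc, and the module path is treated as a prefix of the dependency's namespace.

```rust
/// One generated dependency crate and the namespace its schema declares.
/// Hypothetical type for illustration; flatc has no such structure.
pub struct Dep<'a> {
    pub crate_name: &'a str,      // e.g. "foo_generated"
    pub namespace: &'a [&'a str], // e.g. ["foo"] for `namespace foo;`
}

/// Returns the `use` line to emit inside the module at `module_path`,
/// or `None` when that module lies outside the dependency's namespace.
pub fn glob_import(dep: &Dep, module_path: &[&str]) -> Option<String> {
    // Import only if the module path is a prefix of the dep's namespace.
    if !dep.namespace.starts_with(module_path) {
        return None;
    }
    let mut path = vec![dep.crate_name];
    path.extend_from_slice(module_path);
    Some(format!("use {}::*;", path.join("::")))
}
```

With a dependency `foo_generated` whose namespace is `foo`, the crate root gets `use foo_generated::*;`, the `foo` module gets `use foo_generated::foo::*;`, and `foo::bar` gets no import, which matches the hand-written example above.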

adsnaider · 2024-04-11T15:03:00Z

Unfortunately, this doesn't work when the imports are ambiguous. For instance:

foo.fbs -> namespace a1.b1.c1 - includes bar and baz
bar.fbs -> namespace a1.b2.c1
baz.fbs -> namespace a1.b2.c1

This will result in bar::a1::* and baz::a1::* being imported into the same a1 module, which results in an ambiguous import of b2.

We will probably have to change the code generation to use absolute paths instead for our use case. Would love to upstream our work once it's ready. Would this be a reasonable contribution or would this change to absolute paths break other use cases?
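To make the absolute-path idea concrete, here is a minimal sketch, assuming each schema becomes its own crate. The three crates are simulated as top-level modules in one file so the example compiles on its own; in a real multi-crate build the `crate::bar` prefix would be the external crate path `::bar`. The struct names and fields are made up for illustration and are not flatc output.

```rust
// Stand-in for the crate generated from bar.fbs (namespace a1.b2.c1).
pub mod bar {
    pub mod a1 { pub mod b2 { pub mod c1 {
        #[derive(Debug, Default, PartialEq)]
        pub struct Bar { pub x: i32 }
    } } }
}

// Stand-in for the crate generated from baz.fbs (also namespace a1.b2.c1).
pub mod baz {
    pub mod a1 { pub mod b2 { pub mod c1 {
        #[derive(Debug, Default, PartialEq)]
        pub struct Baz { pub y: i32 }
    } } }
}

// Stand-in for the crate generated from foo.fbs (namespace a1.b1.c1).
pub mod foo {
    pub mod a1 { pub mod b1 { pub mod c1 {
        #[derive(Debug, Default, PartialEq)]
        pub struct Foo {
            // Each field names its type by full path, so no glob import
            // of `a1` or `b2` is needed and nothing can be ambiguous.
            pub bar: crate::bar::a1::b2::c1::Bar,
            pub baz: crate::baz::a1::b2::c1::Baz,
        }
    } } }
}
```

Because every reference starts from a crate name, two dependencies sharing the namespace a1.b2.c1 never compete for the same `b2` name.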

github-actions · 2024-10-10T20:33:50Z

This issue is stale because it has been open 6 months with no activity. Please comment or label not-stale, or this will be closed in 14 days.

adsnaider · 2024-10-10T22:48:03Z

not-stale

adsnaider · 2024-10-10T22:48:53Z

Btw, would love to hear feedback from other users or maintainers on how they are dealing with this

csmulhern · 2025-03-17T16:12:35Z

@adsnaider I'm currently dealing with the same issue. Would love to understand how you've approached this and what a PR for this might look like.

adsnaider · 2025-03-17T16:32:56Z

@csmulhern, I haven't solved the issue yet. There are a few options that I see, but I'm not sure what would be best for upstream. Part of the solution may involve the build system. I sent a message to the FlatBuffers Discord, but we didn't really agree on a clear option. This was my message there:

Context: I work in a large C++ codebase and we make significant use of flatbuffers. We use Bazel as our build system, which makes it really simple to have each flatbuffer be its own static library. This is incredibly useful for a few reasons, the main one being that two distinct C++ libraries that depend on the same flatbuffer can pass flatbuffers around (as C++ objects).
Problem: The issue that I've run into (and this is a similar problem with other code-generated schemas) is that in Rust, the smallest unit of compilation is a crate, and each crate must have a unique name in the compilation graph. If I try to follow the same approach as with C++ of one flatbuffer per compilation unit, (1) it becomes really hard to provide a name for each crate, and (2) more importantly, the generated code for dependencies between flatbuffers becomes erroneous. This is because the generated code uses relative paths to refer to dependencies (e.g. super::super::Foo), but now Foo does not belong to the same crate.
Possible solutions: These are some options that I've considered, all with different trade-offs, but I'm curious whether there are other solutions people have come up with or implemented:

1. Generate all of the flatbuffers and place them into a single crate. Solves a lot of these issues but in a big enough codebase (may?) become a bottleneck in compilation. Additionally, it is not suitable when working in a modular project where you may want to include third-party flatbuffers into your hierarchy.

2. 1 flatbuffer crate per Rust compilation unit/crate. Essentially, for each Rust crate (rust_library/rust_binary) aggregate all of the flatbuffer-dependencies required into a single crate. Fixes modularity but becomes impossible to pass a generated flatbuffer type to one of its dependencies since Foo in my crate would not be the same type as Foo in my crate dependency.

3. Somehow change the flatbuffer generated code to figure out where to import a specific name from so that it can use absolute paths and also generate a single module that re-exports all of the types of each flatbuffer dependency into a single module hierarchy. I believe this option would solve every problem but there are some technical issues that need solving to implement it and may require a "forever patched" flatbuffer code generator.

This hasn't been an urgent issue for us yet so we haven't settled on any solution yet, but I suspect we will have to come back to this eventually.

Open to discussing more and helping with an implementation if you have some ideas.
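As a rough sketch of the third option (absolute paths plus a re-exporting module), assuming one crate per schema: the two generated crates are simulated here as modules in one file, and `schemas` is a hypothetical facade module, not something flatc generates today. In a real build, `crate::foo_generated` would be an external crate path.

```rust
// Stand-in for the crate generated from foo.fbs (namespace foo).
pub mod foo_generated {
    pub mod foo {
        #[derive(Debug, Default, PartialEq)]
        pub struct Foo { pub id: u32 }
    }
}

// Stand-in for the crate generated from foobar.fbs (namespace foo.bar).
pub mod foobar_generated {
    pub mod foo { pub mod bar {
        #[derive(Debug, Default, PartialEq)]
        pub struct FooBar {
            // Absolute path into the dependency instead of `super::super::Foo`.
            pub foo: crate::foo_generated::foo::Foo,
        }
    } }
}

// Hypothetical facade: one merged namespace tree that re-exports the
// types of every schema crate under their flatbuffer namespaces.
pub mod schemas {
    pub mod foo {
        pub use crate::foo_generated::foo::*;
        pub mod bar {
            pub use crate::foobar_generated::foo::bar::*;
        }
    }
}
```

Since `pub use` re-exports the same type rather than creating a new one, `schemas::foo::Foo` and `foo_generated::foo::Foo` stay interchangeable across crates.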

csmulhern · 2025-03-18T01:09:37Z

@adsnaider, thanks for that additional context.

The three options you've outlined are exactly the same ones that I am considering.

Option 1 seems untenable for the reason you've outlined; a real solution needs to allow the definition of a common schema that's used across projects.

Option 2 seems untenable too, as you don't have cross crate compatibility.
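The compatibility problem with Option 2 can be illustrated with a small sketch. The two modules below stand in for two aggregate crates that each contain their own copy of the code generated from the same schema; the module names and the `same_type` helper are hypothetical.

```rust
use std::any::TypeId;

// Copy of the generated `Foo` inside the first consumer's aggregate crate.
pub mod consumer_a_schemas {
    pub struct Foo { pub field: i32 }
}

// Identical copy inside the second consumer's aggregate crate.
pub mod consumer_b_schemas {
    pub struct Foo { pub field: i32 }
}

/// Reports whether two type parameters name the very same Rust type.
pub fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}
```

Even though both definitions are textually identical, `same_type::<consumer_a_schemas::Foo, consumer_b_schemas::Foo>()` is false, so a value of one cannot be passed where the other is expected.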

For Option 3, I'm not exactly clear on what you're suggesting in terms of the reexports.

As a reference, Swift is module based in terms of compilation units, and so has to solve similar challenges as Rust. They solve this by having one module (crate) per proto library, and then requiring that proto dependencies of the proto_library have their analogous module be a dependency of the generated module.

For example:

# proto/BUILD.bazel

proto_library(
    name = "foo",
    srcs = ["foo.proto"],
)

proto_library(
    name = "bar",
    srcs = ["bar.proto"],
    deps = [":foo"],
)

swift_proto_library(
    name = "foo_swift",
    protos = [":foo"],
)

swift_proto_library(
    name = "bar_swift",
    protos = [":bar"],
    deps = [":foo_swift"],
)

# foo.proto

syntax = "proto3";

message Foo {
    string field = 1;
}

# bar.proto

syntax = "proto3";

import "proto/foo.proto";

message Bar {
    Foo field = 1;
}

In the generated code, they reference the fields from imports with a fully qualified name. For instance, here is the generated code for the Bar structure in the bar_swift module:

import proto_foo_swift

public struct Bar {
  // SwiftProtobuf.Message conformance is added in an extension below. See the
  // `Message` and `Message+*Additions` files in the SwiftProtobuf library for
  // methods supported on all messages.

  public var field: proto_foo_swift.Foo {
    get {return _field ?? proto_foo_swift.Foo()}
    set {_field = newValue}
  }

  ...

where proto_foo_swift is the Bazel target name where slashes and colons have been converted to underscores (//proto:foo_swift -> proto_foo_swift). This is the module (crate) name given to the module generated by //proto:foo_swift. The equivalent in Rust would be proto_foo_rust::Foo.
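The naming rule described above (slashes and colons become underscores) could be sketched in Rust as follows. The function name is hypothetical, and this covers only the rule as stated in this comment; the actual Bazel rules may handle more cases, such as external repositories or other characters.

```rust
/// Derives a module (crate) name from a Bazel label by dropping the
/// leading `//` and replacing `/` and `:` with `_`.
pub fn module_name_for_label(label: &str) -> String {
    label
        .strip_prefix("//")
        .unwrap_or(label)
        .chars()
        .map(|c| if c == '/' || c == ':' { '_' } else { c })
        .collect()
}
```

For example, `module_name_for_label("//proto:foo_rust")` yields `proto_foo_rust`, the crate name used in the `proto_foo_rust::Foo` path mentioned above.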

This is what I think should be done for flatbuffer generated Rust code in Bazel projects. Ideally, flatc would support this style of generation, but I'm not sure of the best way to achieve that (e.g. module name mappings could be provided as a command line argument). protoc uses a plugin architecture to allow custom code generation defined outside of protoc to be leveraged by protoc. flatc doesn't do this, but a custom generator could be built on top of the flatbuffers library. I believe the Swift protobuf generation inside the Bazel rules uses a custom plugin to achieve the style of generation covered above. See: https://github.com/bazelbuild/rules_swift/blob/343f35ebef603b92eb458b929a94f4ef97338d78/proto/swift_proto_compiler.bzl#L31.

I'm unclear on how Swift code generation works in Flatbuffers today, but I will take a look soon.

csmulhern · 2025-03-18T01:30:14Z

Looks like the Flatbuffer Swift code generator has the same problem. The foo / bar example above generates the following code for Bar:

import FlatBuffers

public struct Bar: FlatBufferObject, Verifiable {

  ...

  public var field: Foo? { let o = _accessor.offset(VTOFFSET.field.v); return o == 0 ? nil : Foo(_accessor.bb, o: _accessor.indirect(o + _accessor.position)) }

  ...

I.e. it imports the runtime library only, and uses Foo with the assumption that it's defined in the same module.

csmulhern · 2025-03-18T02:08:45Z

It looks like struct definitions (Definition.file) contain the path to the file they were defined in.

If we have a mapping of file path -> Bazel target, we should be able to map the Type names in a relatively straightforward way through Namer.
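A minimal sketch of that lookup, under stated assumptions: the `Definition` struct below is a simplified, hypothetical stand-in for the information flatc holds about a type (its name, namespace, and defining file), and `file_to_crate` is an assumed mapping supplied by the build system. Namespace components are used as module names unchanged, which glosses over any case conversion a real generator applies.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical view of a schema definition.
pub struct Definition<'a> {
    pub name: &'a str,
    pub namespace: &'a [&'a str],
    pub file: &'a str, // path of the schema file that defines the type
}

/// Builds the Rust path for `def` as seen from the crate generated for
/// `current_file`. Returns `None` if the defining file has no mapping.
pub fn qualified_rust_path(
    def: &Definition,
    current_file: &str,
    file_to_crate: &HashMap<&str, &str>,
) -> Option<String> {
    // Types from the schema being generated live in the current crate;
    // everything else is looked up in the file -> crate mapping.
    let root = if def.file == current_file {
        "crate"
    } else {
        *file_to_crate.get(def.file)?
    };
    let mut segments = vec![root];
    segments.extend_from_slice(def.namespace);
    segments.push(def.name);
    Some(segments.join("::"))
}
```

Returning `None` for an unmapped file lets the caller decide whether to fail the build or fall back to the existing relative-path behavior.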

I'd be curious if there's an abstraction here we could supply this information to flatc using that is generic / useful enough where there would be an appetite to add this to flatc somehow.

Having to write a whole custom code generator to support this would be quite a nuisance. The individual code generators used by flatc are not public, so this cannot be easily achieved through e.g. subclassing the current Rust code generator and adding a custom codegen binary built using the flatbuffers library.

cc @dbaileychess, @aardappel, @CasperN - any idea who would be best to weigh in here?

csmulhern · 2025-03-22T04:56:36Z

I've sketched out what I think a useful version of this may look like. The API should be generalizable to all module based languages, and initial support has been added in the Rust code generator. Have a look at #8563.

csmulhern · 2025-03-28T03:29:52Z

@adsnaider, it makes me happy to see your positive reactions. Would be good to understand if the approach taken in #8563 suits your use case / expected usage.

For reference, I have written custom bazel rules wrapping flatc for Rust code generation from flatbuffer schema files using the --module-mapping flag, and have managed to have modules that generate code from schemas with dependencies that are generated by other targets.

adsnaider · 2025-03-28T22:59:48Z

@csmulhern I'm currently traveling but your solution seems reasonable and we have folks at work that are interested in trying it out for an upcoming project. I can make sure they give their feedback after trying it out

csmulhern · 2025-03-30T00:20:01Z

Glad to hear it! Cheers.

github-actions bot added the stale label Oct 10, 2024

github-actions bot removed the stale label Oct 11, 2024

csmulhern linked a pull request Mar 22, 2025 that will close this issue

Adds module mapping of includes for Rust code generation #8563

Open
