126 releases (11 breaking)
Version | Released |
---|---|
0.19.1 (new) | Nov 5, 2024 |
0.19.0 | Oct 17, 2024 |
0.17.0 | Jul 8, 2024 |
0.15.0-alpha.5 | Mar 29, 2024 |
0.8.0 | Jul 27, 2023 |
re_types_builder
Part of the rerun family of crates.
This crate implements Rerun's code generation tools.
These tools translate language-agnostic IDL definitions (flatbuffers) into code.
You can generate the code with pixi run codegen.
Doclinks
The .fbs files can contain docstrings (///), which in turn can contain doclinks.
They are written in the form [archetypes.Image].
Only links to types are currently supported.
Link checking is not done by the codegen, but the output is checked implicitly by cargo doc, lychee, etc.
We only support doclinks to the default rerun.scope.
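To make the doclink form above concrete, here is a minimal sketch that pulls candidate doclinks out of a docstring. The function name extract_doclinks and the matching rules (a bracketed kind.Name pair with exactly one dot) are assumptions made for this example; this is not the parser the codegen actually uses.

```rust
/// Hypothetical helper (not part of re_types_builder): returns every
/// `[kind.Name]` doclink found in `docstring`, as `(kind, name)` pairs.
///
/// Assumption: a doclink is a bracketed span with exactly one dot and
/// non-empty alphanumeric/underscore text on both sides of it.
fn extract_doclinks(docstring: &str) -> Vec<(String, String)> {
    let is_ident =
        |s: &str| !s.is_empty() && s.chars().all(|c| c.is_alphanumeric() || c == '_');

    let mut links = Vec::new();
    let mut rest = docstring;

    while let Some(open) = rest.find('[') {
        let after_open = &rest[open + 1..];
        let Some(close) = after_open.find(']') else {
            break; // unmatched `[`: nothing more to extract
        };
        let inner = &after_open[..close];

        let mut matched = false;
        if let Some((kind, name)) = inner.split_once('.') {
            if is_ident(kind) && is_ident(name) {
                links.push((kind.to_owned(), name.to_owned()));
                matched = true;
            }
        }

        // On a match, skip past the closing bracket. Otherwise only skip the
        // opening bracket, so that a later `[` inside this span is still seen.
        rest = if matched { &after_open[close + 1..] } else { after_open };
    }

    links
}
```

Calling extract_doclinks on a docstring that mentions [archetypes.Image] yields the pair ("archetypes", "Image"). Spans with no dot or with more than one dot are skipped, in line with only type links being supported.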
lib.rs:
Organization
The code generation process happens in 4 phases.
1. Generate binary reflection data from flatbuffers definitions.
All this does is invoke the flatbuffers compiler (flatc) with the right flags in order to generate the binary dumps.
Look for compile_binary_schemas in the code.
2. Run the semantic pass.
The semantic pass transforms the low-level raw reflection data generated by the first phase into higher-level objects that are much easier to inspect and manipulate.
Look for objects.rs.
3. Fill the Arrow registry.
The Arrow registry keeps track of all type definitions and maps them to Arrow datatypes.
Look for arrow_registry.rs.
4. Run the actual codegen pass for a given language.
We currently have two codegen passes implemented: Python & Rust.
Codegen passes use the semantic objects from phase two and the registry from phase three in order to generate user-facing code for Rerun's SDKs.
These passes are intentionally implemented using a very low-tech, no-frills approach (stitch strings together, make liberal use of unimplemented, etc.) that keeps them flexible in the face of ever-changing needs in the generated code.
Look for codegen/python.rs and codegen/rust.rs.
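The way phases two to four fit together can be sketched as follows. Everything here is a simplified stand-in: Object, Registry and generate_rust are hypothetical names for this example, the "datatype" is a plain string rather than a real Arrow datatype, and the actual crate's types and signatures will differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a semantic object (phase 2).
struct Object {
    /// Fully-qualified name, e.g. "rerun.datatypes.Vec2D".
    fqname: String,
    /// (field name, Rust type) pairs.
    fields: Vec<(String, String)>,
}

/// Hypothetical stand-in for the Arrow registry (phase 3): maps a
/// fully-qualified name to a textual description of its datatype.
#[derive(Default)]
struct Registry {
    datatypes: BTreeMap<String, String>,
}

impl Registry {
    fn register(&mut self, obj: &Object) {
        let fields: Vec<String> = obj
            .fields
            .iter()
            .map(|(name, ty)| format!("{name}: {ty}"))
            .collect();
        let datatype = format!("Struct({})", fields.join(", "));
        self.datatypes.insert(obj.fqname.clone(), datatype);
    }
}

/// Phase 4 in the low-tech style described above: stitch strings together
/// and bail out with `unimplemented!` on anything not handled yet.
fn generate_rust(obj: &Object, registry: &Registry) -> String {
    let Some(datatype) = registry.datatypes.get(&obj.fqname) else {
        unimplemented!("no datatype registered for {}", obj.fqname);
    };
    // The last path segment becomes the Rust type name.
    let name = obj.fqname.rsplit('.').next().unwrap_or(obj.fqname.as_str());

    let mut code = String::new();
    code.push_str(&format!("/// Arrow datatype: {datatype}\n"));
    code.push_str(&format!("pub struct {name} {{\n"));
    for (field, ty) in &obj.fields {
        code.push_str(&format!("    pub {field}: {ty},\n"));
    }
    code.push_str("}\n");
    code
}
```

Registering an object named rerun.datatypes.Vec2D with two f32 fields and passing it to generate_rust returns the source text of a pub struct Vec2D, preceded by a doc comment naming the registered datatype.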
Error handling
Keep in mind: this is all build-time code that will never see the light of runtime. There is therefore no need for fancy error handling in this crate: all errors are fatal to the build anyway.
Make sure to crash as soon as possible when something goes wrong and to attach all the appropriate/available context using anyhow's with_context (e.g. always include the fully-qualified name of the faulty type/field) and you're good to go.
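The sketch below shows the fail-fast idea with context attached. To stay dependency-free it uses only the standard library, with map_err standing in for anyhow's with_context. The helper names and the field name rerun.datatypes.Vec3D.xyz are illustrative assumptions, not code from this crate.

```rust
/// Hypothetical helper: parses an array length attached to a field.
/// On failure, the error message starts with the fully-qualified name of the
/// faulty field, which is the context the text above asks for.
/// (Std-only stand-in: the crate itself attaches context through anyhow.)
fn parse_array_len(field_fqname: &str, raw: &str) -> Result<usize, String> {
    raw.trim()
        .parse::<usize>()
        .map_err(|err| format!("{field_fqname}: invalid array length {raw:?}: {err}"))
}

/// Build-time code path: every error is fatal, so crash right away
/// instead of trying to recover.
fn array_len_or_crash(field_fqname: &str, raw: &str) -> usize {
    match parse_array_len(field_fqname, raw) {
        Ok(len) => len,
        Err(msg) => panic!("codegen failed: {msg}"),
    }
}
```

With a well-formed input such as "3", array_len_or_crash simply returns the length. With a malformed one, the build stops with a message that names the offending field.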
Testing
Same comment as with error handling: this code becomes irrelevant at runtime, and so testing it brings very little value.
Make sure to test the behavior of its output though: re_types!
Understanding the subtleties of affixes
So-called "affixes" are effects applied to objects defined with the Rerun IDL and that affect the way these objects behave and interoperate with each other (so, yes, monads. shhh.).
There are 3 distinct and very common affixes used when working with Rerun's IDL: transparency, nullability and plurality.
Broadly, we can describe these affixes as follows:
- Transparency allows for bypassing a single layer of typing (e.g. to "extract" a field out of a struct).
- Nullability specifies whether a piece of data is allowed to be left unspecified at runtime.
- Plurality specifies whether a piece of data is actually a collection of that same type.
We say "broadly" here because the way these affixes ultimately affect objects in practice will actually depend on the kind of object that they are applied to, of which there are 3: archetypes, components and datatypes.
Not only that, but objects defined in Rerun's IDL are materialized into 3 distinct environments: IDL definitions, Arrow datatypes and native code (e.g. Rust & Python).
These environments have vastly different characteristics, quirks, pitfalls and limitations, which once again lead to these affixes having different, sometimes surprising behavior depending on the environment we're interested in. Also keep in mind that Flatbuffers and native code are generally designed around arrays of structures, while Arrow is all about structures of arrays!
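The difference between the two layouts can be sketched in a few lines. Point, PointColumns and the two conversion functions are hypothetical names for this example; real Arrow arrays carry more than plain vectors (validity bitmaps, offsets, and so on).

```rust
/// Array-of-structures element: the layout native code typically works with.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

/// Structure-of-arrays layout, closer in spirit to an Arrow struct column:
/// one contiguous buffer per field.
#[derive(Debug, Default, PartialEq)]
struct PointColumns {
    xs: Vec<f32>,
    ys: Vec<f32>,
}

/// Roughly what serialization has to do: split rows into per-field columns.
fn to_columns(points: &[Point]) -> PointColumns {
    PointColumns {
        xs: points.iter().map(|p| p.x).collect(),
        ys: points.iter().map(|p| p.y).collect(),
    }
}

/// Roughly what deserialization has to do: zip the columns back into rows.
fn to_rows(columns: &PointColumns) -> Vec<Point> {
    columns
        .xs
        .iter()
        .zip(&columns.ys)
        .map(|(&x, &y)| Point { x, y })
        .collect()
}
```

Round-tripping a slice of Point values through to_columns and to_rows returns the original rows. Each nullable or plural field adds more bookkeeping to both directions.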
All in all, these interactions between affixes, object kinds and environments lead to a combinatorial explosion of edge cases that can be very confusing when it comes to (de)serialization code, and even API design.
When in doubt, check out the rerun.testing.archetypes.AffixFuzzer IDL definitions, generated code and test suites for definitive answers.
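As a rough illustration of how the three affixes might surface in native Rust types, consider the sketch below. The Field struct, both functions and the wrapping order (plurality first, nullability last) are assumptions made for this example. As noted above, the real behavior depends on the object kind and the environment, so treat the AffixFuzzer definitions as the source of truth.

```rust
/// Hypothetical description of a field as it might come out of the IDL.
struct Field {
    /// Rust type of a single element, e.g. "f32".
    elem_type: &'static str,
    /// Nullability: may the value be left unspecified at runtime?
    nullable: bool,
    /// Plurality: is the value a collection of `elem_type`?
    plural: bool,
}

/// Spells out a native Rust type for `field`.
/// Assumption for this sketch: plurality wraps first, nullability last,
/// so a nullable plural field becomes `Option<Vec<T>>`.
fn rust_type(field: &Field) -> String {
    let mut ty = field.elem_type.to_owned();
    if field.plural {
        ty = format!("Vec<{ty}>");
    }
    if field.nullable {
        ty = format!("Option<{ty}>");
    }
    ty
}

/// Transparency: a single-field wrapper can be bypassed, exposing the type
/// of its inner field instead of the wrapper's own name.
fn exposed_type(wrapper_name: &str, inner: &Field, transparent: bool) -> String {
    if transparent {
        rust_type(inner)
    } else {
        wrapper_name.to_owned()
    }
}
```

Even in this toy version, the three flags already combine into several distinct shapes for a single element type, which hints at where the combinatorial explosion comes from.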