#pipe #context #report #assets #json #data #dagster

bin+lib dagster_pipes_rust

A Dagster pipes implementation for interfacing with Rust

8 releases

0.1.7 Dec 29, 2024
0.1.6 Dec 3, 2024
0.1.4 Nov 27, 2024

#453 in Encoding


141 downloads per month

Apache-2.0

100KB
1.5K SLoC

dagster-pipes-rust

A pipes implementation for the Rust programming language.

Get full observability into your Rust workloads when orchestrating through Dagster. With this lightweight interface, you can retrieve data directly from the Dagster context, report asset materializations, report asset checks, provide structured logging, and more.


Usage

Installation

cargo add dagster_pipes_rust

Example

An example project can be found in ./example-dagster-pipes-rust-project.

The project contains a rust_processing_jobs binary that shows how to report an asset materialization to Dagster through the context.report_asset_materialization method.

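use dagster_pipes_rust::open_dagster_pipes;
use serde_json::json;

fn main() {
    // Open the Pipes connection to the orchestrating Dagster process.
    let mut context = open_dagster_pipes();
    // Each metadata entry pairs a raw value with its declared type.
    let metadata = json!({"row_count": {"raw_value": 100, "type": "int"}});
    // Report the materialization for the asset defined on the Dagster side.
    context.report_asset_materialization("example_rust_subprocess_asset", metadata);
}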

It also demonstrates how to run the Rust binary in a subprocess from Dagster. Note that it is also possible to launch processes in external compute environments such as Kubernetes.

import shutil

import dagster as dg


@dg.asset(
    group_name="pipes",
    kinds={"rust"},
)
def example_rust_subprocess_asset(
    context: dg.AssetExecutionContext, pipes_subprocess_client: dg.PipesSubprocessClient
) -> dg.MaterializeResult:
    """Demonstrates running Rust binary in a subprocess."""
    cmd = [shutil.which("cargo"), "run"]
    cwd = dg.file_relative_path(__file__, "../rust_processing_jobs")
    return pipes_subprocess_client.run(
        command=cmd,
        cwd=cwd,
        context=context,
    ).get_materialize_result()


defs = dg.Definitions(
    assets=[example_rust_subprocess_asset],
    resources={
        "pipes_subprocess_client": dg.PipesSubprocessClient(
            context_injector=dg.PipesEnvContextInjector(),
        )
    },
)
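
When running the binary by hand during development (for example with a plain cargo run), it can help to know whether the process was actually launched by Dagster. The following is a minimal sketch, not part of this crate's API: the helper name launched_by_dagster is hypothetical, and the variable names DAGSTER_PIPES_CONTEXT and DAGSTER_PIPES_MESSAGES are an assumption based on the bootstrap variables used by Dagster's Python dagster-pipes package, not something stated in this README.

```rust
use std::collections::HashMap;

// Assumption: these names mirror the bootstrap variables that Dagster's
// Python `dagster-pipes` package reads; they are not taken from this crate.
const CONTEXT_ENV_VAR: &str = "DAGSTER_PIPES_CONTEXT";
const MESSAGES_ENV_VAR: &str = "DAGSTER_PIPES_MESSAGES";

/// Hypothetical helper: returns true only when both bootstrap variables
/// are present and non-empty in the given environment snapshot.
fn launched_by_dagster(env: &HashMap<String, String>) -> bool {
    [CONTEXT_ENV_VAR, MESSAGES_ENV_VAR]
        .iter()
        .all(|key| env.get(*key).map_or(false, |value| !value.is_empty()))
}
```

In a binary, the snapshot could be built with `let env: HashMap<String, String> = std::env::vars().collect();` and passed to the helper before deciding whether to open the Pipes context. Taking the environment as a map rather than reading it inside the helper keeps the check easy to test.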

Contributing

Pipes Schema

We use jsonschema to define the pipes protocol and quicktype to generate the Rust structs. Currently, the JSON schemas live in jsonschema/pipes, but they should be hosted and defined in a centralized repository in the future.

To generate the Rust structs, make sure to install quicktype with npm install -g quicktype. Then run:

cd community-integrations/pipes
make jsonschema_rust

Dependencies

~1.2–2.3MB
~46K SLoC