2 unstable releases

0.3.0 Feb 20, 2021
0.2.0 Dec 15, 2020

#17 in #data-integration


Used in fluvio-cluster

Apache-2.0

465KB
11K SLoC

Fluvio is a lean and mean distributed data streaming engine written in Rust. Combined with the Stateful DataFlow distributed stream processing framework, Fluvio gives developers a unified, composable paradigm for distributed streaming and stream processing. It is the foundation of InfinyOn Cloud.

Quick Start - Get started with Fluvio and Stateful DataFlow in 5 minutes or less on your system!

Step 1. Download Fluvio Version Manager:

Fluvio is installed via the Fluvio Version Manager, shortened to fvm.

To install fvm, run the following command:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

As part of the initial setup, fvm also installs the Fluvio CLI version that is in the stable channel at the time of installation.

Fluvio is stored in $HOME/.fluvio, with the executable binaries stored in $HOME/.fluvio/bin.

For the best compatibility on Windows, InfinyOn recommends WSL2.

Step 2. Start a cluster:

Start a cluster on your local machine with the following command:

fluvio cluster start

Step 3. Install SDF CLI

Stateful dataflows are managed with the sdf CLI, which is installed using fvm:

fvm install sdf-beta4

Step 4. Create the Dataflow file

Create a directory named split-sentence-inline to hold the dataflow file:

mkdir -p split-sentence-inline
cd split-sentence-inline

Create the dataflow.yaml and add the following content:

apiVersion: 0.5.0

meta:
  name: split-sentence-inline
  version: 0.1.0
  namespace: example

config:
  converter: raw

topics:
  sentence:
    schema:
      value:
        type: string
        converter: raw
  words:
    schema:
      value:
        type: string
        converter: raw

services:
  sentence-words:
    sources:
      - type: topic
        id: sentence

    transforms:
      - operator: flat-map
        run: |
          fn sentence_to_words(sentence: String) -> Result<Vec<String>> {
            Ok(sentence.split_whitespace().map(String::from).collect())
          }
      - operator: map
        run: |
          pub fn augment_count(word: String) -> Result<String> {
            Ok(format!("{}({})", word, word.chars().count()))
          }

    sinks:
      - type: topic
        id: words
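
To see what the two operators do before running the dataflow, the inline functions can be exercised in plain Rust. This is a minimal sketch, not how SDF executes them: the `Result` alias stands in for whatever the SDF runtime supplies, and `run_pipeline` is a hypothetical helper that chains the flat-map and map steps by hand.

```rust
// Assumption: this alias stands in for the `Result` type the SDF runtime provides.
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

// flat-map: one input sentence becomes zero or more output words.
fn sentence_to_words(sentence: String) -> Result<Vec<String>> {
    Ok(sentence.split_whitespace().map(String::from).collect())
}

// map: one input word becomes exactly one annotated output word.
fn augment_count(word: String) -> Result<String> {
    Ok(format!("{}({})", word, word.chars().count()))
}

// Hypothetical helper: feeds every word produced by the flat-map into the map,
// mirroring the order of the transforms in dataflow.yaml.
fn run_pipeline(sentences: &[&str]) -> Result<Vec<String>> {
    let mut out = Vec::new();
    for sentence in sentences {
        for word in sentence_to_words(sentence.to_string())? {
            out.push(augment_count(word)?);
        }
    }
    Ok(out)
}

fn main() -> Result<()> {
    for line in run_pipeline(&["Hello world", "Hi there"])? {
        println!("{line}");
    }
    Ok(())
}
```

Because `chars().count()` returns the number of characters in a word, the input `Hello world` yields `Hello(5)` and `world(5)`.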

Step 5. Run the DataFlow

Use the sdf command-line tool to run the dataflow:

sdf run --ui

The --ui flag serves the graphical representation of the dataflow on SDF Studio.

Step 6. Test the DataFlow

Produce sentences to the sentence topic:

fluvio produce sentence

Input some text, for example:

Hello world
Hi there

Consume from words to retrieve the result:

fluvio consume words -Bd

See the results, for example:

Hello(5)
world(5)
Hi(2)
there(5)

Step 7. Inspect State

The dataflow collects runtime metrics that you can inspect in the runtime terminal.

Check the sentence-to-words counters:

show state sentence-words/sentence-to-words/metrics

See results, for example:

Key    Window  succeeded  failed
stats  *       2          0

Check the augment-count counters:

show state sentence-words/augment-count/metrics

See results, for example:

Key    Window  succeeded  failed
stats  *       4          0
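
The two tables differ because each operator appears to be counted once per invocation: two sentences enter the flat-map, and the four resulting words enter the map. The sketch below reproduces those numbers under that assumption. The `Metrics` type and `count_invocations` function are hypothetical illustrations, not SDF's metrics implementation.

```rust
// Hypothetical counter type; SDF's real metrics are not defined in this guide.
#[derive(Debug, Default, PartialEq)]
struct Metrics {
    succeeded: u64,
    failed: u64,
}

impl Metrics {
    // Record the outcome of one operator invocation.
    fn record<T, E>(&mut self, outcome: &Result<T, E>) {
        match outcome {
            Ok(_) => self.succeeded += 1,
            Err(_) => self.failed += 1,
        }
    }
}

// Runs both transforms by hand and counts one invocation per input record.
fn count_invocations(sentences: &[&str]) -> (Metrics, Metrics) {
    let mut flat_map = Metrics::default();
    let mut map = Metrics::default();
    for sentence in sentences {
        // Same logic as sentence_to_words in dataflow.yaml.
        let words: Result<Vec<String>, String> =
            Ok(sentence.split_whitespace().map(String::from).collect());
        flat_map.record(&words);
        for word in words.unwrap_or_default() {
            // Same logic as augment_count in dataflow.yaml.
            let augmented: Result<String, String> =
                Ok(format!("{}({})", word, word.chars().count()));
            map.record(&augmented);
        }
    }
    (flat_map, map)
}

fn main() {
    let (sentence_to_words, augment_count) =
        count_invocations(&["Hello world", "Hi there"]);
    println!("sentence-to-words: {:?}", sentence_to_words);
    println!("augment-count: {:?}", augment_count);
}
```

With the sample input `Hello world` and `Hi there`, the first counter reaches 2 and the second reaches 4, matching the tables above.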

Congratulations! You've successfully built and run a composable dataflow!

More examples of Stateful DataFlow are on GitHub - https://github.com/infinyon/stateful-dataflows-examples/.

Check Fluvio Core Documentation

The Fluvio documentation provides additional context on how to use Fluvio clusters, the CLI, clients, and development kits.

Check Stateful DataFlow Documentation

Stateful DataFlow is designed to handle complex data processing workflows, allowing for customization and scalability through various programming languages and system primitives.

Learn how to build custom connectors

Fluvio can connect to practically any system that you can think of.

  • For first-party systems, Fluvio clients can integrate with the edge system or application to source data.
  • For third-party systems, Fluvio connectors connect at the protocol level and collect data into Fluvio topics.

Out of the box, Fluvio has native inbound connectors for HTTP, webhook, MQTT, and Kafka. Its outbound connectors support HTTP, SQL, and Kafka, along with experimental builds for DuckDB, Redis, S3, Graphite, and others.

With the Connector Development Kit, it's intuitive to build connectors to any system quickly.

Check out the docs and let us know if you need help building any connector.

Learn how to build custom smart modules

Fluvio applies Wasm-based stream processing and data transformations. We call these reusable transformation functions smart modules. Smart modules are built using the Smart Module Development Kit and can be distributed through the InfinyOn Cloud hub.

Try workflows on InfinyOn Cloud

InfinyOn Cloud is Fluvio on the cloud as a managed service.

Clients

Language-specific API docs:

Community Maintained:

Contributing

If you'd like to contribute to the project, please read our Contributing guide.

Community

Many Fluvio users and developers have made projects to share with the community. Here are a few:

Projects Using Fluvio

Community Connectors

Community Development Resources

More projects and utilities are available in the Fluvio Community GitHub org.

Contributors are awesome

Made with contrib.rocks.

License

This project is licensed under the Apache license.

Dependencies

~29–45MB
~694K SLoC