#service-discovery #fault-tolerant #gossip #memberlist #decentralized-networking #swim #serf

ruserf

A decentralized solution for service discovery and orchestration that is lightweight, highly available, and fault tolerant

2 unstable releases

0.1.0 Apr 15, 2024
0.0.0 May 18, 2023


MPL-2.0 license


RuSerf

A highly customizable, adaptable, runtime-agnostic and WASM/WASI-friendly decentralized solution for service discovery and orchestration that is lightweight, highly available, and fault tolerant.

A Rust port of HashiCorp's serf, with improvements.


Introduction

ruserf is a decentralized solution for service discovery and orchestration that is lightweight, highly available, and fault tolerant.

The use cases for such a library are far-reaching: all distributed systems require membership, and ruserf is a re-usable solution to managing cluster membership and node failure detection.

ruserf is eventually consistent but converges quickly on average. The convergence speed can be heavily tuned via various knobs on the protocol. Node failures are detected, and network partitions are partially tolerated, by attempting to reach potentially dead nodes through multiple routes.
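To illustrate what reaching a node "through multiple routes" can mean, here is a minimal, self-contained sketch of a SWIM-style probe round: a direct ping first, then pings relayed through other members. The `Network`, `ProbeResult` and `probe` names are hypothetical and are not part of the ruserf API. A real implementation would also send the indirect pings concurrently and apply timeouts, which this toy model leaves out.

```rust
use std::collections::HashSet;

/// Outcome of one probe round against a target node.
#[derive(Debug, PartialEq, Eq)]
pub enum ProbeResult {
    /// The target answered the direct ping.
    Alive,
    /// The target did not answer directly, but answered through this relay.
    AliveViaRelay(u32),
    /// Neither the direct ping nor any relayed ping was answered.
    Suspect,
}

/// A toy network (hypothetical): the set of directed links that currently work.
pub struct Network {
    links: HashSet<(u32, u32)>,
}

impl Network {
    pub fn new(links: &[(u32, u32)]) -> Self {
        Self { links: links.iter().copied().collect() }
    }

    /// A ping succeeds only if both the request and the ack can travel.
    fn ping(&self, from: u32, to: u32) -> bool {
        self.links.contains(&(from, to)) && self.links.contains(&(to, from))
    }
}

/// Probe `target` from `me`: first directly, then through each relay in turn.
pub fn probe(net: &Network, me: u32, target: u32, relays: &[u32]) -> ProbeResult {
    if net.ping(me, target) {
        return ProbeResult::Alive;
    }
    for &relay in relays {
        // The relay must be reachable from us and must itself reach the target.
        if net.ping(me, relay) && net.ping(relay, target) {
            return ProbeResult::AliveViaRelay(relay);
        }
    }
    ProbeResult::Suspect
}
```

With links only between nodes 1 and 2 and between nodes 2 and 3, probing node 3 from node 1 fails directly but succeeds when node 2 is offered as a relay, so a broken direct link alone does not mark node 3 as failed.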

ruserf is WASM/WASI friendly: all crates can be compiled to wasm-wasi and wasm-unknown-unknown (the crate features need to be configured accordingly).

Design

Unlike the original Go implementation, ruserf uses a highly generic, layered architecture. Users can implement a component themselves and plug it into ruserf. Users can even define their own Id and Address types.
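As a rough illustration of what being generic over Id and Address can look like in Rust, the sketch below defines a member type parameterized by both. The `NodeId` and `NodeAddress` traits, the `Member` struct and the `RackId` type are all hypothetical and chosen for this example only; the actual traits and bounds in ruserf may differ.

```rust
use std::fmt::{self, Debug, Display};
use std::hash::Hash;

/// Hypothetical bound for node identifiers (not the ruserf trait).
pub trait NodeId: Clone + Eq + Hash + Debug + Display {}
impl<T: Clone + Eq + Hash + Debug + Display> NodeId for T {}

/// Hypothetical bound for node addresses (not the ruserf trait).
pub trait NodeAddress: Clone + Eq + Debug + Display {}
impl<T: Clone + Eq + Debug + Display> NodeAddress for T {}

/// A cluster member that is generic over its identifier and address types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Member<I: NodeId, A: NodeAddress> {
    pub id: I,
    pub addr: A,
}

impl<I: NodeId, A: NodeAddress> Member<I, A> {
    pub fn new(id: I, addr: A) -> Self {
        Self { id, addr }
    }

    /// Human-readable form, e.g. for logs.
    pub fn describe(&self) -> String {
        format!("{}@{}", self.id, self.addr)
    }
}

/// A user-defined identifier that encodes physical placement.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct RackId {
    pub rack: u16,
    pub slot: u16,
}

impl Display for RackId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "rack{}-slot{}", self.rack, self.slot)
    }
}
```

The same `Member` code then works with a `String` id and a `std::net::SocketAddr` address, or with `RackId` and a DNS-name `String` address, without any changes to the generic type.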

Here are the layers:

  • Transport Layer

    By default, ruserf provides two transports: QuicTransport and NetTransport.

    • Runtime Layer

      Async runtime agnosticism is provided by agnostic's Runtime trait. tokio, async-std and smol are supported by default. Users can implement their own Runtime and plug it into ruserf.

    • Address Resolver Layer

      The address resolver layer is supported by nodecraft's AddressResolver trait.

    • Serialize/Deserialize Layer

      By default, ruserf uses length-prefix encoding (Lpe) to serialize messages to bytes and deserialize them back. The Lpe implementation tries to avoid reallocation during serialization and deserialization.

      Users can, however, use any other serialization framework by implementing the Wire trait.

    • NetTransport

      NetTransport has three built-in stream layers.

    • QuicTransport

      The QUIC transport is well tested but still considered experimental.

      QuicTransport has two built-in stream layers.

    Users can also implement their own stream layer for either transport implementation.

  • Delegate Layer

    This layer is used as a reactor for different kinds of messages.

    • Delegate

      Delegate is the trait that clients must implement if they want to hook into the gossip layer of Serf. All the methods must be thread-safe, as they can and generally will be called concurrently.

      Here are the sub delegate traits:

      • MergeDelegate

        Used to involve a client in a potential cluster merge operation. Namely, when a node does a promised push/pull (as part of a join), the delegate is involved and allowed to cancel the join based on custom logic. The merge delegate is NOT invoked as part of the push-pull anti-entropy.

      • TransformDelegate

        A delegate for encoding and decoding. Used to control how ruserf should encode/decode messages.

      • ReconnectDelegate

        Used to customize reconnect behavior. Users can implement it to override the reconnect timeout for individual members.

    • CompositeDelegate

      CompositeDelegate is a helper struct that splits the Delegate into multiple small delegates, so that users do not need to implement the full Delegate when they only want to customize some of its methods.
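The composite idea described in the Delegate layer can be sketched as follows: each concern is a small trait with a no-op default, and a generic wrapper forwards to whichever parts the user supplied. Every name here (`Merge`, `Reconnect`, `Noop`, `Composite`, `NoStaging` and their method signatures) is hypothetical and simplified; this is not the ruserf CompositeDelegate API.

```rust
use std::time::Duration;

/// Hypothetical sub-delegate: may veto a cluster merge.
pub trait Merge {
    fn notify_merge(&self, peers: &[&str]) -> Result<(), String>;
}

/// Hypothetical sub-delegate: may override the reconnect timeout per member.
pub trait Reconnect {
    fn reconnect_timeout(&self, member: &str, default: Duration) -> Duration;
}

/// Default behaviour: accept every merge and keep the default timeout.
pub struct Noop;

impl Merge for Noop {
    fn notify_merge(&self, _peers: &[&str]) -> Result<(), String> {
        Ok(())
    }
}

impl Reconnect for Noop {
    fn reconnect_timeout(&self, _member: &str, default: Duration) -> Duration {
        default
    }
}

/// Combines independent sub-delegates; unset parts stay `Noop`.
pub struct Composite<M = Noop, R = Noop> {
    merge: M,
    reconnect: R,
}

impl Composite {
    pub fn new() -> Self {
        Self { merge: Noop, reconnect: Noop }
    }
}

impl<M, R> Composite<M, R> {
    /// Replace only the merge part, keeping the reconnect part.
    pub fn with_merge<N: Merge>(self, merge: N) -> Composite<N, R> {
        Composite { merge, reconnect: self.reconnect }
    }

    /// Replace only the reconnect part, keeping the merge part.
    pub fn with_reconnect<N: Reconnect>(self, reconnect: N) -> Composite<M, N> {
        Composite { merge: self.merge, reconnect }
    }
}

// The composite forwards each call to the matching part.
impl<M: Merge, R: Reconnect> Merge for Composite<M, R> {
    fn notify_merge(&self, peers: &[&str]) -> Result<(), String> {
        self.merge.notify_merge(peers)
    }
}

impl<M: Merge, R: Reconnect> Reconnect for Composite<M, R> {
    fn reconnect_timeout(&self, member: &str, default: Duration) -> Duration {
        self.reconnect.reconnect_timeout(member, default)
    }
}

/// Example part: refuse to merge with any peer named "staging-*".
pub struct NoStaging;

impl Merge for NoStaging {
    fn notify_merge(&self, peers: &[&str]) -> Result<(), String> {
        match peers.iter().find(|p| p.starts_with("staging-")) {
            Some(p) => Err(format!("refusing to merge with {p}")),
            None => Ok(()),
        }
    }
}
```

With this shape, `Composite::new().with_merge(NoStaging)` customizes merge handling while reconnect behavior stays at its default, which is the convenience the composite approach is meant to provide.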

Protocol

ruserf is based on "SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol". However, HashiCorp developers extended the protocol in a number of ways:

Several extensions increase propagation speed and convergence rate. Another set of extensions, which HashiCorp developers call Lifeguard, makes ruserf more robust in the presence of slow message processing (due to factors such as CPU starvation and network delay or loss). For details on all of these extensions, please read HashiCorp's paper "Lifeguard: SWIM-ing with Situational Awareness", along with the ruserf source.
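One Lifeguard idea is that a node which suspects it is itself slow should become more patient before accusing others. The sketch below shows a simplified local health score that stretches timeouts as the score grows. The `LocalHealth` type, its method names and the saturation limit are assumptions made for illustration; the events that raise or lower the score, and the values used by ruserf, should be taken from the paper and the source.

```rust
use std::time::Duration;

/// Simplified local health score in the spirit of Lifeguard.
/// 0 means healthy; higher values mean this node is probably slow.
pub struct LocalHealth {
    score: u32,
    max: u32,
}

impl LocalHealth {
    /// `max` is the saturation limit for the score (an assumed parameter).
    pub fn new(max: u32) -> Self {
        Self { score: 0, max }
    }

    pub fn score(&self) -> u32 {
        self.score
    }

    /// Record evidence that this node is slow, e.g. one of its probes timed out.
    pub fn degrade(&mut self) {
        self.score = (self.score + 1).min(self.max);
    }

    /// Record evidence that this node is healthy, e.g. a probe was acked in time.
    pub fn improve(&mut self) {
        self.score = self.score.saturating_sub(1);
    }

    /// Stretch a base timeout: a healthy node uses it as-is,
    /// an unhealthy node waits proportionally longer.
    pub fn scale(&self, base: Duration) -> Duration {
        base * (self.score + 1)
    }
}
```

For example, with a base probe timeout of 500 ms, two recorded degradations stretch the effective timeout to 1500 ms, and the score never exceeds the configured limit however many degradations follow.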

Installation

[dependencies]
ruserf = "0.1"

Q & A

  • Is the Rust ruserf implementation compatible with Go's serf?

    No, but it can be. By default the two are not compatible. The difference lies in the serialize/deserialize layer: Go's serf uses msgpack as its serialization framework, so in theory, if you implement a TransformDelegate that is compatible with Go's serf, the two become compatible.

  • If Go's serf adds more functionality, will this project support it too?

    Yes! This project may also add functionality that Go's serf does not have, e.g. wasmer support and bindings to other languages.
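Since the Q & A identifies the serialize/deserialize layer as the compatibility boundary, it may help to see what length-prefix framing looks like in general. The sketch below is a generic illustration that assumes a 4-byte big-endian length header; `encode_frame` and `decode_frame` are hypothetical names, and the actual Lpe wire format used by ruserf is not specified here and may differ.

```rust
/// Append one frame to `out`: a 4-byte big-endian length, then the payload.
/// Writing into a caller-provided buffer lets several frames share one allocation.
pub fn encode_frame(payload: &[u8], out: &mut Vec<u8>) {
    assert!(payload.len() <= u32::MAX as usize, "payload too large for a u32 length");
    let len = payload.len() as u32;
    out.reserve(4 + payload.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
}

/// Try to read one frame from the front of `buf`.
/// Returns the payload and the remaining bytes, both borrowed from `buf`,
/// or `None` if the buffer does not yet hold a complete frame.
pub fn decode_frame(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    let mut header = [0u8; 4];
    header.copy_from_slice(buf.get(..4)?);
    let len = u32::from_be_bytes(header) as usize;
    let rest = &buf[4..];
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

Decoding borrows from the input buffer instead of copying, which is one common way to keep allocations low. A msgpack-based encoding, as used by Go's serf, would replace this framing entirely.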

Related Projects

  • agnostic: helps you develop runtime-agnostic crates.
  • nodecraft: crafting seamless node operations for distributed systems; provides foundational traits for node identification and address resolution.
  • transformable: transforms a value between its structured and byte representations.
  • peekable: a peekable reader and async reader.
  • memberlist: a highly customizable, adaptable, runtime-agnostic and WASM/WASI-friendly gossip protocol that helps manage cluster membership and member failure detection.

License

ruserf is under the terms of the MPL-2.0 license.

See LICENSE for details.

Copyright (c) 2024 Al Liu.

Copyright (c) 2013 HashiCorp, Inc.
