#middleware #load #hyper

little-loadshedder

Latency-based load-shedding hyper/tower middleware

2 unstable releases

0.2.0 Feb 24, 2024
0.1.0 Jan 30, 2023

MIT/Apache

26KB
263 lines

Little Loadshedder

A Rust hyper/tower service that implements load shedding with queuing & concurrency limiting based on latency.

It uses Little's Law to intelligently shed load in order to maintain a target average latency. It achieves this by placing a queue in front of the service it wraps, the size of which is determined by measuring the average latency of calls to the inner service. Additionally, it controls the number of concurrent requests to the inner service, in order to achieve the maximum possible throughput.
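The queue sizing described above can be illustrated with a small calculation. This is only a sketch of how Little's Law (items in system = throughput × time in system) might be applied, not the crate's actual code: the function name `queue_capacity`, its parameters, and the assumption that throughput equals concurrency divided by average latency are all hypothetical.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of the crate's API): how many requests may
/// wait in the queue if the average time in the system should stay near `target`.
/// Assumes the inner service's throughput is `concurrency / avg_latency`.
fn queue_capacity(concurrency: usize, avg_latency: Duration, target: Duration) -> usize {
    let avg_ns = avg_latency.as_nanos();
    if avg_ns == 0 {
        // No latency measurement yet; a real implementation would pick a default.
        return 0;
    }
    // Little's Law: in_system = throughput * target
    //                         = (concurrency / avg_latency) * target.
    // Integer nanosecond arithmetic avoids floating-point rounding.
    let in_system = concurrency as u128 * target.as_nanos() / avg_ns;
    let in_system = usize::try_from(in_system).unwrap_or(usize::MAX);
    // Requests currently being served also count towards the total,
    // so only the remainder is available as queue space.
    in_system.saturating_sub(concurrency)
}

fn main() {
    let target = Duration::from_millis(500);
    let healthy = queue_capacity(10, Duration::from_millis(20), target);
    let degraded = queue_capacity(10, Duration::from_millis(40), target);
    println!("healthy: {healthy}, degraded: {degraded}");
}
```

Under these assumptions, doubling the measured latency roughly halves the queue, because the same target latency now covers half as many requests.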

The following images show metrics from the example server under load generated by the example client.

First, when load is turned on, some requests are rejected while the middleware works out the queue size and increases the concurrency.

[Image: startup]

This quickly resolves to a steady state where the service can easily handle the load on it, and so the queue size is large.

Next, we send a large burst of traffic. Note that none of it is dropped, as it is all absorbed by the queue.

[Image: burst]

Once the burst stops, the service slowly clears its backlog of requests and returns to the steady state.

Now we simulate a service degradation: all requests take twice as long to process. The queue shrinks to about half its original size in order to hit the target average latency. However, the service can no longer achieve this throughput, so the queue fills and requests are rejected.

[Image: degradation]

Note that from the client's point of view, requests are either immediately rejected or complete at roughly the target latency.
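The "immediately rejected or served" behaviour can be pictured as a non-blocking admission check in front of the queue. The sketch below is an illustration using only the standard library; `Gate`, `Permit`, and `try_admit` are hypothetical names and do not reflect the crate's real types or how it tracks capacity.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical admission gate: counts requests in the system
/// (queued plus in flight) against a fixed capacity.
struct Gate {
    in_system: AtomicUsize,
    capacity: usize,
}

/// Held for the lifetime of an admitted request; frees its slot on drop.
struct Permit<'a>(&'a Gate);

impl Gate {
    fn new(capacity: usize) -> Self {
        Gate { in_system: AtomicUsize::new(0), capacity }
    }

    /// Returns a permit if there is room, or `None` straight away.
    /// It never blocks, so a rejected caller learns about it immediately.
    fn try_admit(&self) -> Option<Permit<'_>> {
        self.in_system
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < self.capacity { Some(n + 1) } else { None }
            })
            .ok()
            .map(|_| Permit(self))
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        self.0.in_system.fetch_sub(1, Ordering::AcqRel);
    }
}

fn main() {
    let gate = Gate::new(2);
    let a = gate.try_admit();
    let b = gate.try_admit();
    let c = gate.try_admit(); // the gate is full, so this is rejected at once
    println!("{} {} {}", a.is_some(), b.is_some(), c.is_some());
    drop(a); // finishing a request frees a slot
    println!("{}", gate.try_admit().is_some());
}
```

In a server, a `None` here would typically be turned into an error response for the client, while admitted requests wait their turn and finish in about the target latency.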

Now the service degrades substantially. The queue shrinks to almost nothing and the concurrency is slowly reduced until the latency matches the goal.

[Image: slow]

Finally, the service recovers. The middleware rapidly notices and returns to its initial steady state.

[Image: recovery]

License

Licensed under either of the Apache License, Version 2.0 or the MIT license, at your option.

Contribution

This project welcomes contributions and suggestions. Just open an issue or pull request!

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
