3 releases
Version | Released
---|---
0.3.4 | May 7, 2023
0.3.3 | May 7, 2023
0.3.2 | May 7, 2022
Chomp
Chomp is a fast monadic-style parser combinator library designed to work on stable Rust. It was written as the culmination of the experiments detailed in a series of blog posts.
For its current capabilities, you will find that Chomp performs consistently as well as, if not better than, optimized C parsers, while being vastly more expressive. For an example that builds a performant HTTP parser out of smaller parsers, see http_parser.rs.
Installation
Add the following line to the dependencies section of your Cargo.toml:
[dependencies]
chomp1 = "0.3.4"
Usage
Parsers are functions from a slice over an input type Input<I> to a ParseResult<I, T, E>, which may be thought of as either a success resulting in type T, an error of type E, or a partially completed result which may still consume more input of type I.
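The three outcomes described above can be sketched in plain Rust. This is only an illustrative model: the names Outcome and ascii_digit are hypothetical and are not Chomp's actual types or functions, and the input is assumed to be a plain byte slice.

```rust
// Hypothetical three-state result, modelled on the description above.
// This is NOT Chomp's ParseResult; it only illustrates the idea.
#[derive(Debug, PartialEq)]
enum Outcome<'a, T, E> {
    /// Parsing succeeded: the unconsumed remainder plus the parsed value.
    Done(&'a [u8], T),
    /// Parsing failed with an error value.
    Failed(E),
    /// The input ended before a decision could be made; the payload is the
    /// minimum number of additional bytes needed.
    Incomplete(usize),
}

/// Parse a single ASCII digit from the front of `input`.
fn ascii_digit(input: &[u8]) -> Outcome<'_, u8, String> {
    match input.split_first() {
        None => Outcome::Incomplete(1),
        Some((&c, rest)) if c.is_ascii_digit() => Outcome::Done(rest, c),
        Some((&c, _)) => Outcome::Failed(format!("expected a digit, found byte {}", c)),
    }
}

fn main() {
    assert_eq!(ascii_digit(b"7x"), Outcome::Done(&b"x"[..], b'7'));
    assert_eq!(ascii_digit(b""), Outcome::Incomplete(1));
    assert!(matches!(ascii_digit(b"x7"), Outcome::Failed(_)));
}
```

Keeping "incomplete" separate from "failed" is what lets a caller decide whether to fetch more input or give up.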
The input type is almost never manipulated manually. Rather, one uses parsers from Chomp by invoking the parse! macro. This macro was intentionally designed to be as close as possible to Haskell's do-syntax or F#'s "computation expressions", which are used to sequence monadic computations. At a very high level, usage of this macro allows one to declaratively:
- Sequence parsers, while short circuiting the rest of the parser if any step fails.
- Bind previous successful results to be used later in the computation.
- Return a composite data structure using the previous results at the end of the computation.
In other words, just as a normal Rust function usually looks something like this:
fn f() -> (u8, u8, u8) {
    let a = read_digit();
    let b = read_digit();
    launch_missiles();
    return (a, b, a + b);
}
A Chomp parser with a similar structure looks like this:
fn f<I: U8Input>(i: I) -> SimpleResult<I, (u8, u8, u8)> {
    parse!{i;
        let a = digit();
        let b = digit();
        string(b"missiles");
        ret (a, b, a + b)
    }
}
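To make the sequencing, binding and short-circuiting concrete, here is a minimal sketch of the same parser written without Chomp, using only the standard library. It is a simplified model rather than what parse! actually expands to: the Parsed alias and the digit and literal helpers are hypothetical, failure is reduced to None, and incomplete input is not modelled.

```rust
// Simplified stand-in for a parse result: `None` means failure, `Some` carries
// the remaining input and the parsed value. Hypothetical, not Chomp's types.
type Parsed<'a, T> = Option<(&'a [u8], T)>;

/// Consume one ASCII digit and yield the byte itself.
fn digit(input: &[u8]) -> Parsed<'_, u8> {
    match input.split_first() {
        Some((&c, rest)) if c.is_ascii_digit() => Some((rest, c)),
        _ => None,
    }
}

/// Consume the exact byte sequence `expected`, yielding nothing of interest.
fn literal<'a>(input: &'a [u8], expected: &[u8]) -> Parsed<'a, ()> {
    input.strip_prefix(expected).map(|rest| (rest, ()))
}

fn f(input: &[u8]) -> Parsed<'_, (u8, u8, u8)> {
    // Each `?` short-circuits the rest of the function if that step fails.
    let (rest, a) = digit(input)?;                // bind the first result
    let (rest, b) = digit(rest)?;                 // bind the second result
    let (rest, ()) = literal(rest, b"missiles")?; // sequence, discard the value
    Some((rest, (a, b, a + b)))                   // return the composite value
}

fn main() {
    assert_eq!(f(b"12missiles!"), Some((&b"!"[..], (b'1', b'2', b'1' + b'2'))));
    assert_eq!(f(b"1xmissiles"), None); // second digit missing
    assert_eq!(f(b"12rockets"), None);  // literal does not match
}
```

The remaining input has to be threaded through every step by hand here, which is the bookkeeping a macro like parse! is meant to hide.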
To implement read_digit, we can use the map function to transform any success value while preserving any error or incomplete state:
// Standard Rust, no error handling:
fn read_digit() -> u8 {
    let mut s = String::new();
    std::io::stdin().read_line(&mut s).unwrap();
    s.trim().parse().unwrap()
}
// Chomp, error handling built in, and we make sure we only get a number:
fn read_digit<I: U8Input>(i: I) -> SimpleResult<I, u8> {
    satisfy(i, |c| b'0' <= c && c <= b'9').map(|c| c - b'0')
}
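The way map touches only the success value can be illustrated with the standard library's Result, whose map behaves analogously. This is a sketch under that analogy, not Chomp code: DigitError and raw_digit are hypothetical names, and the incomplete state is folded into an ordinary error for brevity.

```rust
// Hypothetical error type for this sketch; not part of Chomp.
#[derive(Debug, PartialEq)]
enum DigitError {
    Empty,
    NotADigit(u8),
}

/// Consume one ASCII digit and return it as a raw byte (b'0'..=b'9').
fn raw_digit(input: &[u8]) -> Result<(&[u8], u8), DigitError> {
    match input.split_first() {
        None => Err(DigitError::Empty),
        Some((&c, rest)) if c.is_ascii_digit() => Ok((rest, c)),
        Some((&c, _)) => Err(DigitError::NotADigit(c)),
    }
}

/// Convert the raw byte to its numeric value. `map` only runs on `Ok`;
/// any `Err` passes through unchanged.
fn read_digit(input: &[u8]) -> Result<(&[u8], u8), DigitError> {
    raw_digit(input).map(|(rest, c)| (rest, c - b'0'))
}

fn main() {
    assert_eq!(read_digit(b"7up"), Ok((&b"up"[..], 7)));
    assert_eq!(read_digit(b"x"), Err(DigitError::NotADigit(b'x')));
    assert_eq!(read_digit(b""), Err(DigitError::Empty));
}
```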
For more documentation, see the rustdoc output.
Example
#[macro_use]
extern crate chomp1;

use chomp1::prelude::*;

#[derive(Debug, Eq, PartialEq)]
struct Name<B: Buffer> {
    first: B,
    last: B,
}

fn name<I: U8Input>(i: I) -> SimpleResult<I, Name<I::Buffer>> {
    parse!{i;
        let first = take_while1(|c| c != b' ');
        token(b' '); // skip the separating space
        let last = take_while1(|c| c != b'\n');
        ret Name{
            first: first,
            last: last,
        }
    }
}

fn main() {
    assert_eq!(parse_only(name, "Martin Wernstål\n".as_bytes()), Ok(Name{
        first: &b"Martin"[..],
        last: "Wernstål".as_bytes()
    }));
}
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Contact
File an issue here on GitHub or visit gitter.im/m4rw3r/chomp.