
chumsky-branch

branch combinator for the chumsky parsing library

2 unstable releases

0.2.0 Feb 19, 2023
0.1.1 Aug 18, 2022
0.1.0 Aug 18, 2022

#195 in Parser tooling

MIT license

15KB
318 lines


This crate defines three parsing combinators for the chumsky parsing library:

  • not_starting_with: This combinator takes a list of patterns, and matches the shortest string from the input that diverges from all patterns.
  • not_containing: This combinator takes a list of patterns, and matches any string that does not contain any of the patterns.
  • branch: This combinator selects one of several parsers based on how the input begins. Each branch defines two parsers. When the first parser matches, that branch and only that branch is chosen, even if the second parser then fails. The second parser produces the output. You can chain as many branches as you want (similar to if / else if), and you must finish with an else branch, which takes a String and has to produce the output from it. This is useful if you want to parse verbatim input interleaved with some syntax.
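
To make the committing behaviour concrete, here is a minimal plain-Rust sketch of the semantics described above. It does not use chumsky or chumsky-branch, and it is not how the crate is implemented: the names split_before_any, next_token and tokenize are hypothetical, the start patterns are hard-coded, and the placeholder body is simplified to "everything up to }}" rather than an identifier.

```rust
// Illustration only: std-only code that mimics the described behaviour.
// None of these names exist in chumsky or chumsky-branch.

#[derive(Debug, PartialEq)]
enum Token {
    Placeholder(String),
    Comment(String),
    Verbatim(String),
}

/// Splits `input` just before the earliest occurrence of any pattern.
/// If no pattern occurs, the whole input is returned as the first half.
fn split_before_any<'a>(input: &'a str, patterns: &[&str]) -> (&'a str, &'a str) {
    let cut = patterns
        .iter()
        .filter_map(|p| input.find(*p))
        .min()
        .unwrap_or(input.len());
    input.split_at(cut)
}

/// Reads one token, committing to a branch as soon as its start pattern matches.
fn next_token(input: &str) -> Result<(Token, &str), String> {
    if let Some(rest) = input.strip_prefix("{{") {
        // Committed: a missing "}}" is an error, not a fallback to verbatim.
        let (name, rest) = split_before_any(rest, &["}}"]);
        let rest = rest.strip_prefix("}}").ok_or("unterminated placeholder")?;
        return Ok((Token::Placeholder(name.to_owned()), rest));
    }
    if let Some(rest) = input.strip_prefix("/*") {
        // Committed in the same way for comments.
        let (body, rest) = split_before_any(rest, &["*/"]);
        let rest = rest.strip_prefix("*/").ok_or("unterminated comment")?;
        return Ok((Token::Comment(body.to_owned()), rest));
    }
    // Else branch: verbatim text up to the next branch start pattern.
    let (text, rest) = split_before_any(input, &["{{", "/*"]);
    Ok((Token::Verbatim(text.to_owned()), rest))
}

/// Repeats `next_token` until the input is exhausted.
fn tokenize(mut input: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    while !input.is_empty() {
        let (token, rest) = next_token(input)?;
        tokens.push(token);
        input = rest;
    }
    Ok(tokens)
}
```

In this sketch, tokenize("{{name") returns an error instead of falling back to verbatim text, which mirrors the "that branch and that branch only" rule. The real combinators compose with any chumsky parser, so treat this purely as a mental model.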

Example

use chumsky::prelude::*;
use chumsky_branch::prelude::*;

#[derive(Debug, Eq, PartialEq)]
enum Token {
	Placeholder(String),
	Comment(String),
	Verbatim(String)
}

impl Token {
	fn lexer() -> impl Parser<char, Self, Error = Simple<char>> {
		branch(
			"{{",
			text::ident().then_ignore(just("}}")).map(Self::Placeholder)
		)
		.or_branch(
			"/*",
			not_containing(["*/"])
				.then_ignore(just("*/"))
				.map(Self::Comment)
		)
		.or_else(Self::Verbatim)
	}
}

fn lexer() -> impl Parser<char, Vec<Token>, Error = Simple<char>> {
	Token::lexer().repeated().then_ignore(end())
}

let input = "/* Greet the user */Hello {{name}}!";
assert_eq!(&lexer().parse(input).unwrap(), &[
	Token::Comment(" Greet the user ".to_owned()),
	Token::Verbatim("Hello ".to_owned()),
	Token::Placeholder("name".to_owned()),
	Token::Verbatim("!".to_owned())
]);

Dependencies

~2.5MB
~36K SLoC