71 releases (42 stable)
Version | Released |
---|---|
3.3.1 | Oct 26, 2024 |
3.2.0 | Sep 7, 2024 |
2.7.2 | Sep 2, 2024 |
1.6.1 | Aug 13, 2024 |
0.5.2 | Jul 29, 2024 |
rusty_lr
GLR, LR(1) and LALR(1) parser generator for Rust.
Please refer to docs.rs for detailed examples and documentation.
Cargo Features
- `build`: Enable buildscript tools.
- `fxhash`: In the parser table, replace `std::collections::HashMap` with `FxHashMap` from `rustc-hash`.
- `tree`: Enable automatic syntax tree construction. Use this feature for debugging only, since it consumes much more memory and time.
- `error`: Enable detailed parsing error messages for the `Display` and `Debug` traits. Use this feature for debugging only, since it consumes much more memory and time.
Features
- GLR, LR(1) and LALR(1) parser generator
- Multiple paths of parsing with GLR parser
- Provides procedural macros and buildscript tools
- Readable error messages, both when parsing input and when building a grammar
- Customizable reduce actions
- Conflict resolution for ambiguous grammars
- Partial support for regex patterns
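To make the conflict-resolution point concrete, the sketch below shows in plain Rust (without rusty_lr) what "reduce first" versus "shift first" means for a chain of the same binary operator. Subtraction is used only because it makes the difference visible; it is not part of the README grammar, and the function names `eval_reduce_first` and `eval_shift_first` are hypothetical helpers introduced for illustration.

```rust
/// Evaluate `a - b - c - ...` by reducing as soon as two operands are available.
/// This corresponds to left associativity, i.e. preferring "reduce" over "shift".
fn eval_reduce_first(operands: &[i32]) -> i32 {
    let mut iter = operands.iter().copied();
    let first = iter.next().expect("at least one operand");
    iter.fold(first, |acc, x| acc - x)
}

/// Evaluate the same chain by shifting every operand first and reducing from the right.
/// This corresponds to right associativity, i.e. preferring "shift" over "reduce".
fn eval_shift_first(operands: &[i32]) -> i32 {
    let mut iter = operands.iter().rev().copied();
    let last = iter.next().expect("at least one operand");
    iter.fold(last, |acc, x| x - acc)
}

fn main() {
    let operands = [8, 3, 2];
    // (8 - 3) - 2
    println!("reduce first: {}", eval_reduce_first(&operands));
    // 8 - (3 - 2)
    println!("shift first:  {}", eval_shift_first(&operands));
}
```

For associative operators such as `+` and `*` both strategies give the same value, but the parser still has to pick one of them to be deterministic.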
Example
```rust
// this defines the `EParser` struct,
// where `E` is the start symbol
lr1! {
    %userdata i32;           // userdata type
    %tokentype char;         // token type
    %start E;                // start symbol
    %eof '\0';               // eof token

    // token definition
    %token zero '0';
    %token one '1';
    %token two '2';
    %token three '3';
    %token four '4';
    %token five '5';
    %token six '6';
    %token seven '7';
    %token eight '8';
    %token nine '9';
    %token plus '+';
    %token star '*';
    %token lparen '(';
    %token rparen ')';
    %token space ' ';

    // conflict resolving
    %left [plus star];       // reduce first for token 'plus', 'star'

    // context-free grammars
    Digit(char): [zero-nine];   // character set '0' to '9'

    Number(i32)                 // type assigned to production rule `Number`
        : space* Digit+ space*  // regex pattern
    { Digit.into_iter().collect::<String>().parse().unwrap() }; // this will be the value of `Number`
                                                                // reduce action written in Rust code

    A(f32): A plus a2=A {
            *data += 1;         // access userdata by `data`
            println!( "{:?} {:?} {:?}", A, plus, a2 );
            A + a2              // this will be the value of `A`
        }
        | M
        ;

    M(f32): M star m2=M { M * m2 }
        | P
        ;

    P(f32): Number { Number as f32 }
        | space* lparen E rparen space* { E }
        ;

    E(f32) : A ;
}
```

```rust
let parser = EParser::new();         // generate `EParser`
let mut context = EContext::new();   // create context
let mut userdata: i32 = 0;           // define userdata

let input_sequence = "1 + 2 * ( 3 + 4 )";

// start feeding tokens
for token in input_sequence.chars() {
    match context.feed(&parser, token, &mut userdata) {
        //                      ^^^^^  ^^^^^^^^^^^^^ userdata passed here as `&mut i32`
        //                      feed token
        Ok(_) => {}
        Err(e) => {
            // match on a reference so that `e` can still be printed below
            match &e {
                EParseError::InvalidTerminal(invalid_terminal) => {
                    // ...
                }
                EParseError::ReduceAction(error_from_reduce_action) => {
                    // ...
                }
            }
            println!("{}", e);
            // println!( "{}", e.long_message( &parser, &context ) );
            return;
        }
    }
}
context.feed(&parser, '\0', &mut userdata).unwrap(); // feed `eof` token

let res = context.accept(); // get the value of start symbol
println!("{}", res);
println!("userdata: {}", userdata);
```
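As a way to cross-check what the example above should compute, here is a small hand-written evaluator for the same expression language. It is an independent sketch in plain Rust and does not use rusty_lr or reflect how the generated parser works internally; `Evaluator`, `evaluate`, and the `additions` counter (which stands in for the `userdata` counter incremented on each `+`) are hypothetical names introduced for illustration.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hand-written recursive-descent evaluator (not generated by rusty_lr).
struct Evaluator<'a> {
    chars: Peekable<Chars<'a>>,
    additions: i32, // counts '+' operations, like `*data += 1` in the grammar
}

impl<'a> Evaluator<'a> {
    fn new(input: &'a str) -> Self {
        Evaluator { chars: input.chars().peekable(), additions: 0 }
    }

    fn skip_spaces(&mut self) {
        while self.chars.peek() == Some(&' ') {
            self.chars.next();
        }
    }

    // sum := product ('+' product)*
    fn sum(&mut self) -> Option<f32> {
        let mut value = self.product()?;
        while self.chars.peek() == Some(&'+') {
            self.chars.next();
            value += self.product()?;
            self.additions += 1;
        }
        Some(value)
    }

    // product := primary ('*' primary)*
    fn product(&mut self) -> Option<f32> {
        let mut value = self.primary()?;
        while self.chars.peek() == Some(&'*') {
            self.chars.next();
            value *= self.primary()?;
        }
        Some(value)
    }

    // primary := spaces (digits | '(' sum ')') spaces
    fn primary(&mut self) -> Option<f32> {
        self.skip_spaces();
        let value = if self.chars.peek() == Some(&'(') {
            self.chars.next();
            let inner = self.sum()?;
            if self.chars.next() != Some(')') {
                return None; // missing closing parenthesis
            }
            inner
        } else {
            let mut digits = String::new();
            while let Some(c) = self.chars.peek().copied().filter(|c| c.is_ascii_digit()) {
                digits.push(c);
                self.chars.next();
            }
            digits.parse::<i32>().ok()? as f32
        };
        self.skip_spaces();
        Some(value)
    }
}

/// Returns the value of the expression and the number of additions performed,
/// or `None` if the input is not a valid expression.
fn evaluate(input: &str) -> Option<(f32, i32)> {
    let mut evaluator = Evaluator::new(input);
    let value = evaluator.sum()?;
    if evaluator.chars.next().is_some() {
        return None; // trailing input that is not part of the expression
    }
    Some((value, evaluator.additions))
}

fn main() {
    println!("{:?}", evaluate("1 + 2 * ( 3 + 4 )"));
}
```

For the input `"1 + 2 * ( 3 + 4 )"` this evaluator yields the value 15 with two additions counted, which is what the grammar above would be expected to produce for `res` and `userdata`, assuming the reduce actions run once per `+`.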
Readable error messages (with codespan)
- This error message is generated by the buildscript tool, not the procedural macros.
Visualized syntax tree
- With the `tree` feature enabled.
Detailed `ParseError` message
- With the `error` feature enabled.
Syntax
See SYNTAX.md for details of grammar-definition syntax.
- Bootstrap: rusty_lr syntax is written in rusty_lr itself.
Contribution
- Any contribution is welcome.
- Please feel free to open an issue or pull request.
License (Since 2.8.0)
Either of
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
Other Examples