3 releases (breaking)
Version | Released |
---|---|
0.3.0 | Oct 14, 2024 |
0.2.1 | May 25, 2021 |
0.1.0 | May 17, 2021 |
Rust Scopes List Notation parser
This is a configurable parser for Scopes List Notation (SLN), written in Rust.
SLN was invented for the Scopes programming language, which is worth a look as well.
SLN is parsed into token lists, which are a very simple representation. In short, every value is either a symbol, which is represented as a string, or a list, which contains other tokens.
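The token list idea can be sketched as a small recursive enum. This is only an illustration of the data model described above; the names `Token`, `Symbol`, `List`, `symbol_count`, and `example` are assumptions made for this sketch and may not match the types the crate actually exports.

```rust
/// A token as described above: either a symbol or a list of further tokens.
/// Hypothetical type for illustration; the crate's own type may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    /// A symbol, stored as its text.
    Symbol(String),
    /// A list containing other tokens (symbols or nested lists).
    List(Vec<Token>),
}

impl Token {
    /// Convenience constructor for a symbol.
    pub fn symbol(name: &str) -> Self {
        Token::Symbol(name.to_string())
    }

    /// Counts the symbols in this token, descending into nested lists.
    pub fn symbol_count(&self) -> usize {
        match self {
            Token::Symbol(_) => 1,
            Token::List(items) => items.iter().map(Token::symbol_count).sum(),
        }
    }
}

/// Builds the token list for `(print (add 1 2))` by hand.
pub fn example() -> Token {
    Token::List(vec![
        Token::symbol("print"),
        Token::List(vec![
            Token::symbol("add"),
            Token::symbol("1"),
            Token::symbol("2"),
        ]),
    ])
}
```

Because numbers such as `1` and `2` are also just symbols in this model, any interpretation of their meaning is left to the program that consumes the token list.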
Representation
SLN is a simple representation suitable for both code and data. It is indentation-based, similar to Python, and maps directly to a list representation, just like Lisp. Brackets can be used instead of indentation, which is most useful when you want to write something on a single line. You can also avoid the indentation-based features entirely and use it as a pure s-expression parser, or switch between both styles.
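To make the pure s-expression mode concrete, here is a minimal sketch of a bracket-only parser that produces token lists. It is not the crate's implementation: `Token`, `parse_brackets`, and `parse_items` are hypothetical names, and the sketch deliberately ignores indentation, strings, and comments.

```rust
use std::iter::Peekable;

/// Hypothetical token type: a symbol or a list of tokens.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Symbol(String),
    List(Vec<Token>),
}

/// Parses whitespace-separated symbols and `(`...`)` lists from any
/// character source. Returns the top-level tokens, or an error message
/// if the brackets are unbalanced.
pub fn parse_brackets<I: IntoIterator<Item = char>>(input: I) -> Result<Vec<Token>, String> {
    let mut chars = input.into_iter().peekable();
    parse_items(&mut chars, false)
}

/// Reads tokens until the input ends (top level) or a `)` closes the
/// current list (nested level).
fn parse_items<I: Iterator<Item = char>>(
    chars: &mut Peekable<I>,
    nested: bool,
) -> Result<Vec<Token>, String> {
    let mut items = Vec::new();
    while let Some(&c) = chars.peek() {
        match c {
            c if c.is_whitespace() => {
                chars.next();
            }
            '(' => {
                chars.next();
                items.push(Token::List(parse_items(chars, true)?));
            }
            ')' => {
                chars.next();
                return if nested {
                    Ok(items)
                } else {
                    Err("unexpected ')'".to_string())
                };
            }
            _ => {
                // Collect characters up to the next delimiter as one symbol.
                let mut name = String::new();
                while let Some(&c) = chars.peek() {
                    if c.is_whitespace() || c == '(' || c == ')' {
                        break;
                    }
                    name.push(c);
                    chars.next();
                }
                items.push(Token::Symbol(name));
            }
        }
    }
    if nested {
        Err("missing ')'".to_string())
    } else {
        Ok(items)
    }
}
```

Calling `parse_brackets("print (add 1 2)".chars())` yields two top-level tokens: the symbol `print` and a list holding `add`, `1`, and `2`.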
Benefits
The representation has several benefits. It is a simple and flexible notation, so it fits many use cases. It could replace other common text formats such as XML, JSON, TOML, or YAML. Multiline strings are well suited to embedding long texts (as is common in HTML) or code in other languages (for example, code in a shading language).
There are further benefits. Since SLN is parsed into token lists, you can write your program to work with token lists and then get access to every parser written for token lists. You can also write your own parser. So when creating a programming language (maybe a DSL), you can start with this representation, and when you get an idea for a better one, you can switch the parser without changing the logic. You can even keep both parsers in your code and allow both syntaxes, at least until you have refactored the existing code.
It's also the representation used by Scopes, which might be a good thing if you want to try Scopes.
Usage
First you have to create a new parser. You can also create a preconfigured parser that matches the Scopes syntax; if it does not match yet, that is a bug and should be fixed. Then you can parse a file or a string (anything that can be turned into an iterator of characters works) into a list of tokens. Strings are parsed as a list containing two symbols: the first has the name "symbol", and the second contains the string itself.
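The string representation described above can be sketched as follows. This does not use the crate's API: `Token`, `string_token`, and `as_string` are hypothetical helpers written for this example, and the prefix is passed in explicitly rather than taken from a parser configuration.

```rust
/// Hypothetical token type: a symbol or a list of tokens.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Symbol(String),
    List(Vec<Token>),
}

/// Wraps string content in a two-element list: a prefix symbol
/// followed by a symbol holding the content.
pub fn string_token(prefix: &str, content: &str) -> Token {
    Token::List(vec![
        Token::Symbol(prefix.to_string()),
        Token::Symbol(content.to_string()),
    ])
}

/// Returns the string content if `token` is a two-element list whose
/// first symbol equals `prefix`; otherwise returns `None`.
pub fn as_string<'a>(token: &'a Token, prefix: &str) -> Option<&'a str> {
    match token {
        Token::List(items) => match items.as_slice() {
            [Token::Symbol(head), Token::Symbol(content)] if head.as_str() == prefix => {
                Some(content.as_str())
            }
            _ => None,
        },
        Token::Symbol(_) => None,
    }
}
```

A consumer of the token list can use a check like `as_string` to tell string literals apart from ordinary two-element lists, as long as the chosen prefix does not collide with symbols used elsewhere.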
Configuration
Currently you can configure the following:
- how many spaces of indentation are required per nesting level?
- are brackets supported and which characters are used?
- will lines containing a single symbol be interpreted as lists or single symbols?
- are comments supported and which character is used to indicate them?
- should strings be prefixed and which prefix should they get?
Also have a look at the documentation.
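As a rough picture of how these options could fit together, here is a hypothetical settings struct. None of the names (`ParserConfig`, its fields, `indent_level`) come from the crate, and the default values are assumptions chosen for the example; the documentation describes the real configuration API.

```rust
/// Hypothetical settings mirroring the options listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct ParserConfig {
    /// Number of spaces required for one nesting level.
    pub indent_width: usize,
    /// Opening and closing bracket characters; `None` disables brackets.
    pub brackets: Option<(char, char)>,
    /// Whether a line containing a single symbol becomes a list.
    pub single_symbol_lines_as_lists: bool,
    /// Character that starts a comment; `None` disables comments.
    pub comment_char: Option<char>,
    /// Prefix symbol placed before string content; `None` means no prefix.
    pub string_prefix: Option<String>,
}

impl Default for ParserConfig {
    /// Assumed defaults for this sketch only.
    fn default() -> Self {
        ParserConfig {
            indent_width: 4,
            brackets: Some(('(', ')')),
            single_symbol_lines_as_lists: false,
            comment_char: Some('#'),
            string_prefix: Some("symbol".to_string()),
        }
    }
}

impl ParserConfig {
    /// Returns the nesting level of `line` based on its leading spaces,
    /// or `None` if the indentation is not a multiple of `indent_width`.
    pub fn indent_level(&self, line: &str) -> Option<usize> {
        if self.indent_width == 0 {
            return None;
        }
        let spaces = line.chars().take_while(|&c| c == ' ').count();
        if spaces % self.indent_width == 0 {
            Some(spaces / self.indent_width)
        } else {
            None
        }
    }
}
```

Using `Option` for brackets, comments, and the string prefix lets one field express both "is it supported?" and "which character or name is used?", which matches how the questions above are phrased.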