Version 0.1.0 (unstable), released Jul 22, 2021
Lexopt
Lexopt is an argument parser for Rust. It tries to have the simplest possible design that's still correct. It's so simple that it's a bit tedious to use.
Lexopt is:
- Small: one file, no dependencies, no macros. Easy to audit or vendor.
- Correct: standard conventions are supported and ambiguity is avoided. Tested and fuzzed.
- Pedantic: arguments are returned as `OsString`s, forcing you to convert them explicitly. This lets you handle badly-encoded filenames.
- Imperative: options are returned as they are found, nothing is declared ahead of time.
- Annoyingly minimalist: only the barest necessities are provided.
- Unhelpful: there is no help generation and error messages often lack context.
Example
struct Args {
    thing: String,
    number: u32,
    shout: bool,
}

fn parse_args() -> Result<Args, lexopt::Error> {
    use lexopt::prelude::*;

    let mut thing = None;
    let mut number = 1;
    let mut shout = false;
    let mut parser = lexopt::Parser::from_env();
    while let Some(arg) = parser.next()? {
        match arg {
            Short('n') | Long("number") => {
                number = parser.value()?.parse()?;
            }
            Long("shout") => {
                shout = true;
            }
            Value(val) if thing.is_none() => {
                thing = Some(val.into_string()?);
            }
            Long("help") => {
                println!("Usage: hello [-n|--number=NUM] [--shout] THING");
                std::process::exit(0);
            }
            _ => return Err(arg.unexpected()),
        }
    }

    Ok(Args {
        thing: thing.ok_or("missing argument THING")?,
        number,
        shout,
    })
}

fn main() -> Result<(), lexopt::Error> {
    let args = parse_args()?;
    let mut message = format!("Hello {}", args.thing);
    if args.shout {
        message = message.to_uppercase();
    }
    for _ in 0..args.number {
        println!("{}", message);
    }
    Ok(())
}
Let's walk through this:
- We start parsing with `Parser::from_env()`.
- We call `parser.next()` in a loop to get all the arguments until they run out.
- We match on arguments. `Short` and `Long` indicate an option.
- To get the value that belongs to an option (like `10` in `-n 10`) we call `parser.value()`.
  - This returns a standard `OsString`.
  - For convenience, `use lexopt::prelude::*` adds a `.parse()` method, analogous to `str::parse`.
- `Value` indicates a free-standing argument. In this case, a filename.
  - `if thing.is_none()` is a useful pattern for positional arguments. If we already found `thing` we pass it on to another case.
  - It also contains an `OsString`.
    - The standard `.into_string()` method can decode it into a plain `String`.
- If we don't know what to do with an argument we use `return Err(arg.unexpected())` to turn it into an error message.
- Strings can be promoted to errors for custom error messages.
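The two conversion steps in the walkthrough (decoding an `OsString`, then parsing the text) can be illustrated with the standard library alone. The helper below, `parse_os`, is a hypothetical name invented for this sketch. It is not lexopt's `.parse()` method and uses a plain `String` as its error type; it only shows roughly what such a convenience has to do.

```rust
use std::ffi::OsString;
use std::str::FromStr;

/// Hypothetical helper (not part of lexopt): decode an `OsString`,
/// then parse the text with `str::parse`.
fn parse_os<T>(value: OsString) -> Result<T, String>
where
    T: FromStr,
    T::Err: std::fmt::Display,
{
    // `into_string` hands the original `OsString` back if it is not valid unicode.
    let text = value
        .into_string()
        .map_err(|raw| format!("invalid unicode in {:?}", raw))?;
    // From here on this is ordinary string parsing.
    text.parse::<T>()
        .map_err(|err| format!("cannot parse {:?}: {}", text, err))
}

fn main() {
    let number: Result<u32, String> = parse_os(OsString::from("10"));
    println!("{:?}", number);

    let bad: Result<u32, String> = parse_os(OsString::from("ten"));
    println!("{:?}", bad);
}
```

Keeping the decode step explicit is what makes it possible to treat a badly-encoded value differently from one that is well-formed text but not a number.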
This covers almost all the functionality in the library. Lexopt does very little for you.
For a larger example with useful patterns, see `examples/cargo.rs`.
Command line syntax
The following conventions are supported:
- Short options (`-q`)
- Long options (`--verbose`)
- `--` to mark the end of options
- `=` to separate long options from values (`--option=value`)
- Spaces to separate options from values (`--option value`, `-f value`)
- Unseparated short options (`-fvalue`)
- Combined short options (`-abc` to mean `-a -b -c`)
These are not supported:
- `-f=value` for short options
- Options with optional arguments (like GNU sed's `-i`, which can be used standalone or as `-iSUFFIX`)
- Single-dash long options (like find's `-name`)
- Abbreviated long options (GNU's getopt lets you write `--num` instead of `--number` if it can be expanded unambiguously)
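To make some of the supported conventions concrete, here is a minimal stand-alone sketch of a splitter for `--`, `--option=value` and combined short options. It is not lexopt's implementation: the `Token` type and the `lex` function are hypothetical names for this illustration, and it works on `&str` rather than `OsString` to stay short.

```rust
/// Hypothetical token type for this sketch (lexopt's own types differ).
#[derive(Debug, PartialEq)]
enum Token {
    Short(char),
    Long(String),
    /// A long option with a value attached by `=`.
    LongWithValue(String, String),
    Positional(String),
}

/// Split raw arguments into tokens without knowing which options take values.
fn lex(args: &[&str]) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut options_ended = false;
    for arg in args {
        if options_ended {
            // Everything after `--` is positional, even if it starts with a dash.
            tokens.push(Token::Positional(arg.to_string()));
        } else if *arg == "--" {
            options_ended = true;
        } else if let Some(long) = arg.strip_prefix("--") {
            match long.split_once('=') {
                Some((name, value)) => {
                    tokens.push(Token::LongWithValue(name.to_string(), value.to_string()))
                }
                None => tokens.push(Token::Long(long.to_string())),
            }
        } else if arg.len() > 1 && arg.starts_with('-') {
            // Treat `-abc` as `-a -b -c`. A lone `-` falls through to positional.
            tokens.extend(arg.chars().skip(1).map(Token::Short));
        } else {
            tokens.push(Token::Positional(arg.to_string()));
        }
    }
    tokens
}

fn main() {
    let tokens = lex(&["-abc", "--option=value", "--verbose", "--", "--not-an-option"]);
    for token in &tokens {
        println!("{:?}", token);
    }
}
```

A splitter like this cannot tell `-fvalue` from three combined flags, or `--option value` from an option followed by a positional argument, because that depends on which options take values. This is one reason a caller-driven design, where the program asks for a value at the moment it matches an option, fits these conventions well.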
Unicode
This library supports unicode while tolerating non-unicode arguments.
Short options may be unicode, but only a single codepoint. (If you need whole grapheme clusters you can use a long option. If you need normalization you're on your own, but it can be done.)
Options can be combined with non-unicode arguments. That is, `--option=���` will not cause an error or mangle the value. This is surprisingly tricky to support: see `os_str_bytes`.
Options themselves are patched as by `String::from_utf8_lossy` if they're not valid unicode. That typically means you'll raise an error later when they're not recognized.
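The difference between lossy patching and strict decoding can be seen with the standard library alone. In the sketch below, `patch_lossy` and `decode_strict` are hypothetical wrapper names chosen for illustration, and the part that builds a non-unicode `OsString` assumes a Unix platform.

```rust
use std::ffi::OsString;

/// Lossy decoding: invalid bytes become U+FFFD, so the result is always a `String`.
fn patch_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Strict decoding: on failure the original `OsString` is returned untouched.
fn decode_strict(value: OsString) -> Result<String, OsString> {
    value.into_string()
}

fn main() {
    // An option name containing the invalid byte 0xFF is patched, not rejected.
    let patched = patch_lossy(b"--opt\xFFion");
    println!("{}", patched);
    // The patched name no longer matches the option it was presumably meant to be.
    println!("{}", patched == "--option");

    // Building an `OsString` from raw bytes is platform-specific (Unix assumed here).
    #[cfg(unix)]
    {
        use std::os::unix::ffi::OsStringExt;
        // 0xE9 is "é" in Latin-1 but is not valid UTF-8 in this position.
        let raw = OsString::from_vec(b"caf\xE9.txt".to_vec());
        println!("{}", decode_strict(raw).is_err());
    }
}
```

A value kept as an `OsString` can still be passed to APIs such as `std::fs::File::open` without being decoded, which is how a badly-encoded filename can survive intact.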
Why?
For a particular application I was looking for a small parser that's pedantically correct. There are other compact argument parsing libraries, but I couldn't find one that handled `OsString`s and implemented all the fiddly details of the argument syntax faithfully.
This library may also be useful if a lot of control is desired, like when the exact argument order matters or not all options are known ahead of time. It could be considered more of a lexer than a parser.
Why not?
This library may not be worth using if:
- You don't care about non-unicode arguments
- You don't care about exact compliance and correctness
- You don't care about code size
- You do care about great error messages
- You hate boilerplate
See also
- `clap`/`structopt`: very fully-featured. The only other argument parser for Rust I know of that truly handles invalid unicode properly, if used right. Large.
- `argh` and `gumdrop`: much leaner, yet still convenient and powerful enough for most purposes. Panic on invalid unicode.
  - `argh` adheres to the Fuchsia specification and therefore does not support `--option=value` and `-ovalue`, only `--option value` and `-o value`.
- `pico-args`: slightly smaller than lexopt and easier to use (but less rigorous).
- `ap`: I have not used this, but it seems to support iterative parsing while being less bare-bones than lexopt.
- libc's `getopt`.
pico-args has a nifty table with build times and code sizes for different parsers. I've rerun the tests and added lexopt (using the program in `examples/pico_test_app.rs`):
| | null | lexopt | pico-args | clap | gumdrop | structopt | argh |
|---|---|---|---|---|---|---|---|
| Binary overhead | 0KiB | 14.5KiB | 13.5KiB | 372.8KiB | 17.7KiB | 371.2KiB | 16.8KiB |
| Build time | 0.9s | 1.7s | 1.6s | 13.0s | 7.5s | 17.0s | 7.5s |
| Number of dependencies | 0 | 0 | 0 | 8 | 4 | 19 | 6 |
| Tested version | - | 0.1.0 | 0.4.2 | 2.33.3 | 0.8.0 | 0.3.22 | 0.1.4 |
(Tests were run on x86_64 Linux with Rust 1.53 and cargo-bloat 0.10.1.)