8 stable releases
Version | Released |
---|---|
1.2.0 | Sep 17, 2023 |
1.1.3 | Jun 25, 2023 |
1.0.2 | Jun 25, 2023 |
1.0.1 | Jun 24, 2023 |
Used in nu-plugin-bexpand
bexpand
Bash-style brace expansion in Rust.
Functionality
- Plain strings are just plain strings: `abcd`
- Special characters need to be escaped with a preceding `\` to be considered as regular characters: `ab\{cd`
  - Outside of a list, special characters are `{}\`.
  - Inside of a list, special characters are `,{}\`.
  - Inside of a sequence, special characters are `,.{}\`.
- Any character preceded by a `\` will be taken literally, even if it's redundant. `\n`, for example, represents `n`, not a newline character.
- A List is a brace-enclosed comma-separated list of Expressions: `{a,b,c}`, `{a,,c}`, `{}`, `{,}`
  - An empty List is still a List.
  - Lists may have empty or duplicate fields.
  - Lists produce all the values of their contained Expressions.
- A Sequence is either a Numeric Sequence or a Character Sequence.
  - A Numeric sequence is in the form `{<spec><start>..<end>[..<stride>]}`
    - `<spec>` is a possibly-empty set of format specifier characters:
      - `=` means to pad each item with leading zeroes to the character width of whichever of `<start>` and `<end>` is longer.
    - `<start>` and `<end>` are signed 64-bit integers. If `<end>` is less than `<start>`, the sequence will count downwards.
    - `<stride>` is an optional non-negative increment, to count in steps of more than 1.
      - The default `<stride>` is `1`.
      - A zero `<stride>` is always normalized to `1` to prevent infinite looping.
      - A stride may cause the endpoint to be skipped, and may even hit the numeric size limit without error. `'{9223372036854775806..9223372036854775807..1000}'` just produces `9223372036854775806`, not an error.
  - A Character sequence is in the form `{<start>..<end>[..<stride>]}`
    - `<start>` and `<end>` are Unicode characters; the sequence produces the codepoints between them in order. If `<end>` is less than `<start>`, the sequence will count downwards. If the range would produce a surrogate codepoint, an error is given for each such codepoint instead.
      - An error does not terminate iteration. If an error is returned, following iterations that move out of the surrogate range may still produce good values.
      - This could be used at some point to allow optional replacement characters, but I don't see a value in that over just throwing an error at this time.
    - `<stride>` is an optional non-negative increment, to count in steps of more than 1.
      - The default `<stride>` is `1`.
      - A zero `<stride>` is always normalized to `1` to prevent infinite looping.
      - A stride may cause the endpoint to be skipped, and may even hit the codepoint limit without error. `'{a..z..1114111}'` just produces `a`, not an error.
- An Expression contains a sequence of Plain strings, Lists, and Sequences.
  - An Expression produces the cartesian product of all its items: `{a,b}c{d,e}f{g..i}` produces `["acdfg","acdfh","acdfi","acefg","acefh","acefi","bcdfg","bcdfh","bcdfi","bcefg","bcefh","bcefi"]`
  - Expression order is produced in lexicographic order, keyed by the index of each sub-expression.
- Expressions and Lists may nest arbitrarily. `'{a,{b,,c{\,..\.}}{f..d..2}}'` produces `["a","bf","bd","f","d","c,f","c,d","c-f","c-d","c.f","c.d"]`
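The cartesian-product rule for Expressions can be sketched in a few lines of Rust. This is only an illustration of the ordering described above, not bexpand's own code: `cartesian_concat` is a hypothetical helper, and it assumes each item of the Expression has already been expanded into its list of values.

```rust
/// Hypothetical helper (not part of bexpand's API): concatenate one value
/// from each item, left to right, producing every combination.
/// The leftmost item varies slowest, which gives the lexicographic order
/// keyed by the index of each sub-expression.
fn cartesian_concat(items: &[Vec<String>]) -> Vec<String> {
    // Start with a single empty prefix so the first item has something to extend.
    let mut results = vec![String::new()];
    for choices in items {
        let mut next = Vec::with_capacity(results.len() * choices.len());
        for prefix in &results {
            for choice in choices {
                next.push(format!("{prefix}{choice}"));
            }
        }
        results = next;
    }
    results
}
```

Feeding it the pre-expanded items of `{a,b}c{d,e}f{g..i}`, that is `[a, b]`, `[c]`, `[d, e]`, `[f]`, `[g, h, i]`, yields the twelve strings listed above in the same order.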
Differences from Bash
bexpand does not fully conform to Bash's behavior. It differs in the following ways:
- Some patterns are considered ill-formed and produce an error in bexpand. It will not try to truck along if a bad pattern is found.
- Braces are special characters and are not allowed unless they are either correctly formed or escaped. In Bash, `a{b,c}d}e` expands to `abd}e acd}e` and `a{b{c,d}e` expands to `a{bce a{bde`. In bexpand, both are errors.
- Empty and single-component lists are considered acceptable in bexpand. In Bash, `a{}b` and `a{b}c` are both literally repeated by the shell. In bexpand, these expand to `ab` and `abc`.
- bexpand allows character sequences to iterate over any valid Unicode codepoints. `{🥰..🥴..2}` is a valid character sequence, as are `{\{..\.}` and `{9..A}`. Technically, `{\0..\9}` is valid as well, and will be treated as a character sequence, though it expands to the exact same thing as a numeric sequence of the same form. Anything that would generate an illegal Unicode codepoint will generate an error.
- The width specifier is an equal sign placed right after the opening brace, so Bash's `{001..100}` is written in bexpand as `{=1..100}`. This allows things like `{=-5..10}`, which is impossible to express in Bash.
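To make the sequence rules concrete, here is a minimal Rust sketch of both sequence kinds. It is written from the description in this README, not taken from bexpand's source: `numeric_sequence` and `char_sequence` are hypothetical names, the `pad` flag stands in for the `=` specifier, and the sketch assumes that a minus sign counts toward the padded width, which the text above does not spell out.

```rust
/// Hypothetical helper: expand `{<start>..<end>..<stride>}` for integers.
/// `pad` models the `=` width specifier.
fn numeric_sequence(start: i64, end: i64, stride: u64, pad: bool) -> Vec<String> {
    // A zero stride is normalized to 1 so the loop always makes progress.
    let stride = stride.max(1);
    // Assumption: the width is measured on the printed form, sign included.
    let width = if pad {
        start.to_string().len().max(end.to_string().len())
    } else {
        0
    };
    let descending = end < start;
    let mut out = Vec::new();
    let mut current = start;
    loop {
        out.push(format!("{:0width$}", current, width = width));
        // Checked arithmetic: overflowing i64 ends the sequence quietly.
        let next = if descending {
            current.checked_sub_unsigned(stride)
        } else {
            current.checked_add_unsigned(stride)
        };
        match next {
            Some(n) if (descending && n >= end) || (!descending && n <= end) => current = n,
            _ => break,
        }
    }
    out
}

/// Hypothetical helper: expand a character sequence. Each step is either a
/// valid character or an error carrying the offending codepoint.
fn char_sequence(start: char, end: char, stride: u32) -> Vec<Result<char, u32>> {
    let stride = stride.max(1);
    let (first, last) = (start as u32, end as u32);
    let descending = last < first;
    let mut out = Vec::new();
    let mut current = first;
    loop {
        // `char::from_u32` returns None for surrogates (U+D800..=U+DFFF);
        // the error is recorded and iteration continues.
        out.push(char::from_u32(current).ok_or(current));
        let next = if descending {
            current.checked_sub(stride)
        } else {
            current.checked_add(stride)
        };
        match next {
            Some(n) if (descending && n >= last) || (!descending && n <= last) => current = n,
            _ => break,
        }
    }
    out
}
```

Under these assumptions, `numeric_sequence(-5, 10, 1, true)` runs from `-5` through `00` to `10`, and `char_sequence('a', 'z', 1114111)` yields only `a`, matching the stride behavior described in the Functionality section.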
License
Copyright 2023 Taylor Richberger
Published under the terms of the Mozilla Public License Version 2.0.