5 releases (3 breaking)
Version | Released
---|---
0.6.1 | Oct 2, 2023
0.5.1 | Sep 20, 2023
0.5.0 | Sep 19, 2023
0.4.0 | Aug 31, 2023
0.3.0 | Aug 11, 2023
Ghee: that thin layer of data change management over Btrfs, using Linux extended attributes (xattrs)
The tastiest way to manage data using Linux extended attributes (xattrs), written in pure Rust and made of delicious, fibrous Open Source code.
Ghee provides tools for manipulating xattrs on individual files as well as for working with the filesystem as a document database where the filesystem paths act as primary keys and extended attributes act as fields.
Data in Ghee tables can be managed using Git-style commits, implemented as Btrfs subvolumes and read-only snapshots.
In this way Ghee leverages the copy-on-write (CoW) nature of Btrfs to efficiently store deltas, even on large binary blobs.
License
This software is licensed under GPL version 3 only.
Conventions
Extended attribute names are parsed in a consistent manner by Ghee. Any xattr name not prefixed with the trusted, security, system, or user namespace is placed in the user namespace by default. For example, xattr trusted.uptime remains as is, while uptime would become user.uptime.
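The namespace rule above can be sketched in a few lines of Rust. This is an illustration of the convention only, not Ghee's actual parser; the function name `normalize_xattr_name` is hypothetical.

```rust
/// Namespaces recognized by the convention described above.
const NAMESPACES: [&str; 4] = ["trusted", "security", "system", "user"];

/// Returns the attribute name with an explicit namespace,
/// defaulting to `user` when none of the known namespaces is present.
/// (Hypothetical helper; Ghee's own parser may differ in detail.)
fn normalize_xattr_name(name: &str) -> String {
    let has_namespace = NAMESPACES.iter().any(|ns| {
        // A namespace only counts when it is followed by a dot,
        // so "username" is not mistaken for the "user" namespace.
        name.strip_prefix(*ns)
            .map_or(false, |rest| rest.starts_with('.'))
    });
    if has_namespace {
        name.to_string()
    } else {
        format!("user.{name}")
    }
}
```

Under this sketch, `normalize_xattr_name("uptime")` gives `user.uptime`, while `trusted.uptime` is returned unchanged.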
Extended attribute values are parsed as f64 numbers if possible; otherwise, they are interpreted as strings.
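A minimal sketch of the value rule, assuming Rust's standard `str::parse::<f64>` decides what counts as a number (Ghee may accept a narrower or wider set of inputs). The `Value` enum and `parse_value` function are illustrative names, not part of Ghee's API.

```rust
/// A parsed attribute value: a number when the text parses as `f64`,
/// otherwise the original text. (Illustrative type, not Ghee's own.)
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// Tries the numeric interpretation first and falls back to a string.
fn parse_value(raw: &str) -> Value {
    match raw.parse::<f64>() {
        Ok(number) => Value::Number(number),
        Err(_) => Value::Text(raw.to_string()),
    }
}
```

With this rule, `123` becomes a number while `Jama` and `http://example.com` stay strings.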
REPL
Running ghee with no arguments will enter a read-eval-print loop (REPL), allowing for fluent command input:
$ ghee
Ghee 0.6.0
ghee$ set ./test -s test=1
etc.
Subcommands
Ghee operates through a set of subcommands, each with a primary function. Run ghee --help to list them, and ghee $SUBCMD --help to get usage information for each subcommand.
Examples of each subcommand follow:
Move
Moves xattr values from one path to another.
ghee mv path1.txt path2.txt: move all xattrs from path1.txt to path2.txt
ghee mv -f id path1.txt path2.txt: move xattr id from path1.txt to path2.txt
ghee mv -f id -f url path1.txt path2.txt: move xattrs id and url from path1.txt to path2.txt
Copy
Copies xattr values from one path to another.
ghee cp path1.txt path2.txt: copy all xattrs from path1.txt to path2.txt
ghee cp -f id path1.txt path2.txt: copy xattr id from path1.txt to path2.txt
ghee cp -f id -f url path1.txt path2.txt: copy xattrs id and url from path1.txt to path2.txt
Remove
Removes xattr values, recursively by default.
ghee rm path.txt: remove all xattrs on path.txt
ghee rm dir: remove all xattrs from dir and all descendants
ghee rm --flat dir: remove all xattrs from dir only, not its descendants
ghee rm -f id path.txt: remove xattr id from path.txt
ghee rm -f id -f url path1.txt path2.txt path3.txt: remove xattrs id and url from path1.txt, path2.txt, and path3.txt
ghee rm -f name dir: remove xattr name from dir and all its descendants
Set
Sets xattr values, recursively by default.
ghee set -s id=123 path1.txt: set xattr id to value 123 on path1.txt
ghee set -s name=Jama dir: set xattr name to value "Jama" on dir and all descendants
ghee set -s name=Amira --flat dir: set xattr name to value "Amira" on dir only, not its descendants
ghee set -s id=123 -s url=http://example.com path1.txt path2.txt path3.txt: set xattr id to value 123 and xattr url to value "http://example.com" on path1.txt, path2.txt, and path3.txt
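As a rough illustration of how the `-s name=value` arguments above might be taken apart, the sketch below splits each argument at the first `=` so that values which themselves contain `=` survive intact. The function names are hypothetical, and Ghee's real argument handling may differ.

```rust
/// Splits one `-s` argument of the form `name=value` at the first `=`.
/// Returns `None` when there is no `=` or the name is empty.
/// (Hypothetical helper, not Ghee's API.)
fn parse_assignment(arg: &str) -> Option<(String, String)> {
    let (name, value) = arg.split_once('=')?;
    if name.is_empty() {
        return None;
    }
    Some((name.to_string(), value.to_string()))
}

/// Parses every `-s` argument, reporting the first malformed one.
fn parse_assignments(args: &[&str]) -> Result<Vec<(String, String)>, String> {
    args.iter()
        .map(|arg| {
            parse_assignment(arg)
                .ok_or_else(|| format!("expected name=value, got {arg:?}"))
        })
        .collect()
}
```

Splitting at the first `=` rather than the last is a design choice of this sketch: attribute names are unlikely to contain `=`, while values such as URLs with query strings often do.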
Get
Recursively gets and prints xattr values for one or more paths.
By default, the get subcommand outputs a tab-separated table with a column order of path, field, value.
The value bytes are written to stdout as-is without decoding.
Xattrs under the user.ghee prefix are excluded unless -a / --all is passed.
To opt out of the recursive default, use --flat.
ghee get dir: print all xattrs for directory dir and all descendant files and directories, as raw (undecoded) TSV
ghee get -f id path1.txt: print xattr id and its value on path1.txt as raw (undecoded) TSV
ghee get -f id -f url path1.txt path2.txt path3.txt: print xattrs id and url and their respective values on path1.txt, path2.txt, and path3.txt as raw (undecoded) TSV
The get command can also output JSON, in which case values are decoded as UTF-8, filling in a default codepoint when decoding fails:
ghee get -j --flat dir: print all xattrs for directory dir itself but not its descendants, as UTF-8 decoded JSON
ghee get -j -f id path1.txt: print xattr id and its value on path1.txt as UTF-8 decoded JSON
ghee get -j -f id -f url path1.txt path2.txt path3.txt: print xattrs id and url and their respective values on path1.txt, path2.txt, and path3.txt as UTF-8 decoded JSON
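The difference between the two output modes can be sketched as follows. The TSV row keeps the value bytes untouched, while the JSON path decodes them lossily. This assumes the "default codepoint" is the Unicode replacement character U+FFFD, which is what Rust's `String::from_utf8_lossy` substitutes; the function names are illustrative and not taken from Ghee.

```rust
/// Builds one raw TSV row (path, field, value) without decoding the value bytes.
/// (Illustrative helper; Ghee's real output code may differ.)
fn tsv_row(path: &str, field: &str, value: &[u8]) -> Vec<u8> {
    let mut row = Vec::new();
    row.extend_from_slice(path.as_bytes());
    row.push(b'\t');
    row.extend_from_slice(field.as_bytes());
    row.push(b'\t');
    row.extend_from_slice(value); // bytes pass through as-is
    row.push(b'\n');
    row
}

/// Decodes a value for JSON output, replacing invalid UTF-8 sequences
/// with U+FFFD (an assumption about which code point is used).
fn decode_for_json(value: &[u8]) -> String {
    String::from_utf8_lossy(value).into_owned()
}
```

Raw TSV is therefore the safer choice for binary values, since the lossy decoding used for JSON cannot be reversed.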
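By adding --where (or -w), SQL WHERE-style clauses can be provided to select which files to include in the output. For example, ghee get -w age >= 65 ./patients will select all files under directory ./patients whose user.age attribute is 65 or greater. When run from a shell rather than the REPL, the clause likely needs quoting, since an unquoted > is treated as output redirection.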
Nested indices are always ignored in get output, though they will be used as appropriate to shortcut traversal when WHERE-style predicates are specified.
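To make the WHERE-style filtering concrete, here is a small sketch that parses a clause such as `age >= 65` and tests it against one record's attributes. It handles only numeric comparisons and a single clause, applies a simplified user-namespace default, and does not model index acceleration; all type and function names are hypothetical rather than Ghee's own.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op { Eq, Ne, Lt, Le, Gt, Ge }

/// One numeric comparison, e.g. `user.age >= 65`. (Illustrative type.)
#[derive(Debug, Clone, PartialEq)]
struct Predicate {
    field: String,
    op: Op,
    value: f64,
}

/// Parses `field OP number`; whitespace around the operator is optional.
fn parse_predicate(clause: &str) -> Option<Predicate> {
    // Two-character operators come first so ">=" is not read as "=" or ">".
    const OPS: [(&str, Op); 6] = [
        (">=", Op::Ge), ("<=", Op::Le), ("!=", Op::Ne),
        ("=", Op::Eq), (">", Op::Gt), ("<", Op::Lt),
    ];
    for (symbol, op) in OPS {
        if let Some((lhs, rhs)) = clause.split_once(symbol) {
            let field = lhs.trim();
            if field.is_empty() {
                return None;
            }
            let value = rhs.trim().parse::<f64>().ok()?;
            // Simplified namespace default: bare names go to `user.`.
            let field = if field.contains('.') {
                field.to_string()
            } else {
                format!("user.{field}")
            };
            return Some(Predicate { field, op, value });
        }
    }
    None
}

/// True when the record has the field, it parses as a number,
/// and the comparison holds.
fn record_matches(pred: &Predicate, record: &HashMap<String, String>) -> bool {
    match record.get(&pred.field).and_then(|raw| raw.parse::<f64>().ok()) {
        Some(actual) => match pred.op {
            Op::Eq => actual == pred.value,
            Op::Ne => actual != pred.value,
            Op::Lt => actual < pred.value,
            Op::Le => actual <= pred.value,
            Op::Gt => actual > pred.value,
            Op::Ge => actual >= pred.value,
        },
        None => false,
    }
}
```

A record with `user.age` set to `70` matches `age >= 65` under this sketch, while one with `64`, or with no `user.age` at all, does not.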
Init
Initializes a directory as a table with a specified primary key, optionally inserting records from JSON where each line is parsed independently; see people.json in the repository for an example.
Examples:
ghee init -k name ./people: marks the ./people directory as a table with primary key of name
ghee init -k state -k id ./people-by-state-and-id: marks the ./people-by-state-and-id directory as a table with a compound primary key of [state, id].
ghee init -k sauce ./pizza < ./pizzas.json: marks the ./pizza directory as a table with primary key sauce, importing records from ./pizzas.json
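Since paths act as primary keys, importing a record amounts to choosing a file path from its key fields. The sketch below assumes each key component becomes one path segment under the table directory, in key order; that layout is inferred from the description in this README rather than confirmed against Ghee's source, and `record_path` is a hypothetical name.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Derives the file path for a record from its primary-key fields.
/// Returns `None` if a key field is missing or cannot be used as a
/// single path segment. (Assumed layout, not Ghee's confirmed behavior.)
fn record_path(
    table: &Path,
    key: &[&str],
    record: &HashMap<String, String>,
) -> Option<PathBuf> {
    let mut path = table.to_path_buf();
    for field in key {
        let value = record.get(*field)?;
        // Reject values that would escape or collapse the table directory.
        if matches!(value.as_str(), "" | "." | "..") || value.contains('/') {
            return None;
        }
        path.push(value);
    }
    Some(path)
}
```

For the compound key [state, id], a record with state `CA` and id `7` would land at ./people-by-state-and-id/CA/7 under this assumption.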
Create
Exactly like init, but creates the directory first, or errors if it already exists.
Insert
Inserts JSON-formatted records into a table.
Records are read one per line from stdin.
ghee ins ./people < ./people.json: inserts the records from ./people.json into the table at ./people, indexed by its primary key
ghee ins ./people ./people.json: same as the above, but not depending on the shell for redirection
Delete
Deletes records from a table.
They are unlinked from all table indices.
The records to be deleted are specified by providing either the components of the primary key or SQL-style WHERE clauses.
ghee del ./people Von: because the table's primary key is name, deletes the record where name=Von from ./people and all indices.
ghee del ./people -w name=Von: deletes ./people/Von as above, unlinking from all indices.
Index
Indexes a table.
When Ghee acts on a directory as if it were a database table, each file acts as a relational "record" with the primary key inferred from its subpath under the table directory.
Each file's extended attributes act as the relational attributes.
Table directories created by Ghee also contain a special xattr user.ghee.tableinfo which stores the primary key and related indices (including itself) of a table.
If no index location is provided, it will be placed in a default path under the table being indexed.
Examples:
- ghee idx -k name ./people ./people-by-name: recursively reindex the contents of ./people into a new directory ./people-by-name with primary key coming from xattr name and files hardlinked to the corresponding files in ./people.
  That means the ./people-by-name directory's files will have filenames taken from the names of the people as defined in xattr name.
  The user.ghee.tableinfo xattr for ./people records ./people-by-name as a related index, and the reciprocal is true as well: the user.ghee.tableinfo xattr for ./people-by-name records ./people as a related index.
  Queries such as get and del will be opportunistically accelerated using available indices.
- ghee idx -k region -k name -s ./people-by-name ./people-by-region-and-name: recursively reindex the contents of ./people-by-name into a new directory ./people-by-region-and-name with primary key being the compound of xattr region and xattr name (in that order) and files hardlinked to the corresponding files in ./people, resolved via the hardlinks in ./people-by-name.
  The user.ghee.tableinfo xattrs of both directories will be updated to reflect their relationship.
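The hardlinking step described above can be sketched with the standard library alone. The example takes the already-read key value as a parameter, because reading xattrs needs a crate or syscall outside `std`; `link_into_index` is a hypothetical name, and updating user.ghee.tableinfo is left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hard-links `record` into `index_dir` under a file name taken from
/// `key_value` (for example, the value of the record's `name` xattr).
/// Returns the path of the new link. (Illustrative sketch only.)
fn link_into_index(
    record: &Path,
    index_dir: &Path,
    key_value: &str,
) -> io::Result<PathBuf> {
    fs::create_dir_all(index_dir)?;
    let target = index_dir.join(key_value);
    // Both paths now name the same inode; no file data is copied.
    fs::hard_link(record, &target)?;
    Ok(target)
}
```

Because extended attributes are stored on the inode, a hard link exposes the same xattrs through both paths, so an index built this way stays in step with the table without duplicating data. Hard links also require both directories to be on the same filesystem.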
List
Like the ls command, lists directory contents, but annotated from Ghee's point of view.
Each path is marked as either a table or a record. For tables, the primary key is given.
ghee ls: lists the current directory's contents
ghee ls example: lists the contents of ./example
Commit
Stores the current state of the table in a Btrfs snapshot, identified by a UUID.
Optionally, a message describing the changes made since the last snapshot (if any) can be provided.
ghee commit -m "Update README.md"
The UUID of the commit is printed.
Log
Displays past commits.
ghee log: lists all past commits in the current table
Touch
Similar to the Unix touch command, creates an empty file at the specified path; if the path is part of a Ghee table, xattrs are inferred from the path and written to the new file.
This is a convenient way to add new records to new tables.
With -p / --parents, parent directories will be created.
ghee touch /pizza/pepperoni: creates an empty file at /pizza/pepperoni, setting xattr topping to pepperoni because the key of the /pizza table is topping.
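Inferring xattrs from a path is the reverse of deriving a path from a key. A minimal sketch, assuming one path segment per key component and the user namespace default for bare names; the function name is hypothetical and the real inference in Ghee may handle more cases.

```rust
use std::path::Path;

/// Maps the path segments of `file` below `table` onto the table's key
/// fields, yielding (xattr name, value) pairs. Returns `None` when the
/// file is outside the table or the depth does not match the key length.
/// (Illustrative helper, not Ghee's API.)
fn infer_key_values(
    table: &Path,
    key: &[&str],
    file: &Path,
) -> Option<Vec<(String, String)>> {
    let relative = file.strip_prefix(table).ok()?;
    // Collect segments as UTF-8 text; give up on non-UTF-8 names.
    let parts: Vec<&str> = relative
        .iter()
        .map(|segment| segment.to_str())
        .collect::<Option<_>>()?;
    if parts.len() != key.len() {
        return None;
    }
    Some(
        key.iter()
            .zip(parts)
            .map(|(field, value)| (format!("user.{field}"), value.to_string()))
            .collect(),
    )
}
```

For the /pizza table keyed by topping, the path /pizza/pepperoni yields the single pair user.topping = pepperoni.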
Restore
Restores paths to their state in the HEAD commit.
ghee restore README.md
Reset
Resets all files in the table to their state in a specified commit.
ghee reset add133b4-f58b-a64e-992a-46f983a0e7ed