4 releases (breaking)
0.6.0 | Apr 4, 2023 |
---|---|
0.5.0 | Jul 31, 2021 |
0.4.0 | Mar 8, 2020 |
0.3.0 | Mar 8, 2020 |
0.1.1 |
|
#562 in Filesystem
1,410 downloads per month
Used in 8 crates
(5 directly)
16KB
297 lines
Recursive file-system digest
This library implements a simple but efficient recursive file-system digest algorithm. You have a directory with some content in it, and you'd like a cryptographical digest (hash) of its content.
It was created for the purpose of checksuming source code packages
in crev
, but it is generic and can be used for any other purpose.
Algorithm
Given any digest algorithm H
(a Hash function algorithm),
a RecursiveDigest(H, path)
is:
- for a file:
H("F" || file_content)
- for a symlink:
H("L" || symlink_content)
- for a directory:
H("D" || directory_content)
As you can see a one-letter ASCII prefix is used to make it impossible
to create a file that has the same digest as a directory,
etc. The drawback of this approach is that RecursiveDigest(H, path)
of
a simple file is not the same as just a normal digest of it (H(file_content)
) .
file_content
is just the byte content of a file.
symlink_content
is just the path the symlink is pointing to, as bytes.
directory_content
is created by:
- sorting all entries of a directory by name, in ascending order, using a simple byte-sequence comparison
- for all entries concatenating pairs of:
H(entry_name)
RecursiveDigest(H, entry_path)
If optional additional data extensions is used, the H(entry_name)
above becomes
H(entry_name || 0 || additional data)
. The format and meaning of additional
data is unspecified, but was intendet for fielsystem metadata like file system
permissions and ownership.
Dependencies
~0.6–7.5MB
~59K SLoC