app fdedup

Cross platform md5 based file deduplication tool

2 stable releases

1.0.1 Dec 31, 2024
1.0.0 May 18, 2021

MIT license

fdedup

fdedup scans the specified directory tree for files that have different names but the same MD5 hash. By default it prints a report, in JSON format, of all duplicate files it finds and their names. Optionally, it can pass each set of duplicates to another program via --exec, or delete all but the shortest-named duplicate via --keep-shortest.
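The idea of grouping files by MD5 hash can be sketched in a few lines of Python. This is only an illustration of the general technique, not fdedup's actual implementation; the function names `md5_of_file` and `find_duplicates` are hypothetical, and the sketch ignores symlink handling and the open-file limits that fdedup exposes as options.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each MD5 hex digest to the sorted paths sharing it (two or more)."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                by_digest[md5_of_file(path)].append(path)
    # Keep only digests that were seen for more than one file.
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_duplicates("proj")` would return a dictionary with one entry per duplicate set, roughly the same information that fdedup reports.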

fdedup 1.0.0

USAGE:
    fdedup [FLAGS] [OPTIONS] <path>

FLAGS:
    -h, --help               Prints help information
    -l, --ignore-symlinks    don't follow symlinks
        --keep-shortest      delete all but the shortest named duplicate
    -p, --pretend            only show what would be done
    -V, --version            Prints version information

OPTIONS:
        --exec <exec>                 pass each duplicate set to program
        --max-dirs <max-dirs>         max simultaneous open directories [default: 256]
        --max-files <max-files>       max simultaneous open files [default: 512]
        --max-symlinks <max-links>    max symlinks to traverse [default: 128]

ARGS:
    <path>

example

$ fdedup proj 2>/dev/null
...
{"digest":[47,188,21,116,50,152,178,14,75,64,19,93,209,168,218,138],"paths":["file0","file1"]}
...
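In the JSON report the digest is a list of 16 byte values rather than the usual hex string. A minimal Python sketch for converting one report line to hex follows; the helper name `digest_hex` is hypothetical, and it assumes each duplicate set is a single JSON object with `digest` and `paths` keys, as in the line shown above.

```python
import json


def digest_hex(report_line):
    """Return the MD5 digest of one report line as a hex string."""
    record = json.loads(report_line)
    # The digest is a list of integers in 0..255; bytes() packs them.
    return bytes(record["digest"]).hex()


line = ('{"digest":[47,188,21,116,50,152,178,14,75,64,19,93,209,168,218,138],'
        '"paths":["file0","file1"]}')
print(digest_hex(line))  # prints 2fbc15743298b20e4b40135dd1a8da8a
```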

demonstrating --exec

$ fdedup --exec ./print-dup.sh proj 2>/dev/null

print-dup.sh

#! /bin/bash

echo "$@"

The output looks like this, for example:

ee97dc2b732f200d616dae66216d57cc file0 file1

One duplicate set per line: the hash first, followed by the names of the files that share it.
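A handler passed to --exec does not have to be a shell script. The Python sketch below assumes, based on the print-dup.sh output, that the program receives the hash as its first argument followed by the duplicate paths. The script name `report-dup.py` and the rule used to pick the file to keep (shortest path, ties broken alphabetically) are illustrative assumptions and may not match what --keep-shortest does. It only prints a plan and deletes nothing.

```python
#!/usr/bin/env python3
"""Hypothetical --exec handler: report which duplicates could be removed."""
import sys


def split_duplicate_set(argv):
    """Split handler arguments into (md5_hex, paths).

    Assumes the hash comes first and the paths follow.
    """
    if len(argv) < 2:
        raise ValueError("expected a hash followed by at least one path")
    return argv[0], argv[1:]


def plan(paths):
    """Return (keep, extras): keep the shortest path, ties broken alphabetically."""
    ordered = sorted(paths, key=lambda p: (len(p), p))
    return ordered[0], ordered[1:]


def main(argv):
    if len(argv) < 2:
        print("usage: report-dup.py <md5> <path>...", file=sys.stderr)
        return
    md5_hex, paths = split_duplicate_set(argv)
    keep, extras = plan(paths)
    print(f"{md5_hex}: keep {keep}")
    for path in extras:
        print(f"{md5_hex}: duplicate {path}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

After making the script executable, it could be used in the same way as print-dup.sh, for example `fdedup --exec ./report-dup.py proj`.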
