
bin+lib dfkit

A command-line toolkit for querying and transforming CSV, JSON, Parquet, and Avro data

1 unstable release

Uses new Rust 2024

new 0.1.0 Apr 17, 2025

#2347 in Command line utilities

MIT license

29KB
364 lines


dfkit

dfkit is a suite of command-line tools for viewing, querying, and manipulating CSV, Parquet, JSON, and Avro files. It is written in Rust and powered by Apache Arrow and Apache DataFusion. The project is currently a work in progress.

Commands

SUBCOMMANDS:
    cat         Concatenate multiple files or all files in a directory
    convert     Convert file format (CSV, Parquet, JSON)
    count       Count the number of rows in a file
    describe    Show summary statistics for a file
    help        Prints this message or the help of the given subcommand(s)
    query       Run a SQL query on a file
    reverse     Reverse the order of rows
    schema      Show schema of a file
    sort        Sort rows by one or more columns
    split       Split a file into N chunks
    view        View the contents of a file
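
The help text does not say how `split` divides rows among its N chunks. One plausible reading, assumed here purely for illustration, is N contiguous chunks whose sizes differ by at most one row. The sketch below shows that partitioning in plain Rust; `split_into_chunks` is a hypothetical helper and is not taken from dfkit's source.

```rust
/// Split `rows` into `n` contiguous chunks whose sizes differ by at most one.
/// Hypothetical helper for illustration; not dfkit's actual implementation.
fn split_into_chunks<T: Clone>(rows: &[T], n: usize) -> Vec<Vec<T>> {
    assert!(n > 0, "chunk count must be positive");
    let base = rows.len() / n; // minimum rows per chunk
    let extra = rows.len() % n; // the first `extra` chunks get one more row
    let mut chunks = Vec::with_capacity(n);
    let mut start = 0;
    for i in 0..n {
        let len = base + usize::from(i < extra);
        chunks.push(rows[start..start + len].to_vec());
        start += len;
    }
    chunks
}

fn main() {
    // Ten rows split three ways: the first chunk absorbs the remainder.
    let rows: Vec<u32> = (1..=10).collect();
    for (i, chunk) in split_into_chunks(&rows, 3).iter().enumerate() {
        println!("chunk {i}: {chunk:?}");
    }
}
```

Under this assumption, row order is preserved across chunks, so concatenating them in order reproduces the original input.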

Examples

Installation

Dependencies

~76MB
~1.5M SLoC