1 unstable release

Uses new Rust 2024

new 0.1.0 Mar 30, 2025


Used in tpchgen-cli

Apache-2.0

3.5MB
4K SLoC

TPCH Data Generator in Arrow format

This crate generates TPCH data directly in Apache Arrow format using the arrow crate.

Example usage:

See docs.rs: TODO once this is published

Testing:

This crate verifies its output in two ways:

  1. Basic functional tests are in Rust doc tests in the source code (cargo test --doc)
  2. The reparse integration test ensures that the Arrow generators produce the same results as parsing the original tbl format (cargo test --test reparse)
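
The idea behind the second method can be sketched with the standard library alone. This is an illustration only, not the crate's actual test: `Row`, `to_tbl_line`, `parse_tbl_line`, `generate_rows`, and `reparse` are hypothetical names invented for this sketch, and the two-field row layout is an assumption. The sketch writes rows as pipe-delimited TBL text, parses them back, and lets the caller compare the result with the directly generated values.

```rust
/// Hypothetical two-column row, used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: i64,
    comment: String,
}

/// Format a row in pipe-delimited TBL style (every field is followed by `|`).
fn to_tbl_line(row: &Row) -> String {
    format!("{}|{}|", row.key, row.comment)
}

/// Parse a TBL-style line back into a row; returns `None` on malformed input.
/// Assumes field values never contain `|`.
fn parse_tbl_line(line: &str) -> Option<Row> {
    let mut fields = line.split('|');
    let key = fields.next()?.parse::<i64>().ok()?;
    let comment = fields.next()?.to_string();
    Some(Row { key, comment })
}

/// Stand-in for a direct generator: produce `n` deterministic rows.
fn generate_rows(n: i64) -> Vec<Row> {
    (1..=n)
        .map(|key| Row { key, comment: format!("comment {key}") })
        .collect()
}

/// Round-trip every row through TBL text and collect the parsed rows.
fn reparse(rows: &[Row]) -> Vec<Row> {
    rows.iter()
        .map(|row| parse_tbl_line(&to_tbl_line(row)).expect("well-formed TBL line"))
        .collect()
}
```

A check in this style asserts that `reparse(&rows)` equals `rows`. Any divergence between the direct path and the text path then shows up as a failed equality.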

Contributing:

Please see CONTRIBUTING.md for more information on how to contribute to this project.


lib.rs:

Generate TPCH data as Arrow RecordBatches

This crate provides generators for TPCH tables that directly produce Arrow RecordBatches. This is significantly faster than generating TBL or CSV files and then parsing them into Arrow.

Example

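use arrow::util::pretty::pretty_format_batches;
use tpchgen::generators::LineItemGenerator;
use tpchgen_arrow::LineItemArrow;

// Create a SF=1 generator for the LineItem table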
let generator = LineItemGenerator::new(1.0, 1, 1);
let mut arrow_generator = LineItemArrow::new(generator)
  .with_batch_size(10);
// The generator is a Rust iterator that produces RecordBatches
let batch = arrow_generator.next().unwrap();
// Compare the output by pretty-printing it
let formatted_batches = pretty_format_batches(&[batch]).unwrap().to_string();
assert_eq!(formatted_batches.lines().collect::<Vec<_>>(), vec![
  "+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+--------------+--------------+------------+--------------+---------------+-------------------+------------+-------------------------------------+",
  "| l_orderkey | l_partkey | l_suppkey | l_linenumber | l_quantity | l_extendedprice | l_discount | l_tax | l_returnflag | l_linestatus | l_shipdate | l_commitdate | l_receiptdate | l_shipinstruct    | l_shipmode | l_comment                           |",
  "+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+--------------+--------------+------------+--------------+---------------+-------------------+------------+-------------------------------------+",
  "| 1          | 155190    | 7706      | 1            | 17.00      | 21168.23        | 0.04       | 0.02  | N            | O            | 1996-03-13 | 1996-02-12   | 1996-03-22    | DELIVER IN PERSON | TRUCK      | egular courts above the             |",
  "| 1          | 67310     | 7311      | 2            | 36.00      | 45983.16        | 0.09       | 0.06  | N            | O            | 1996-04-12 | 1996-02-28   | 1996-04-20    | TAKE BACK RETURN  | MAIL       | ly final dependencies: slyly bold   |",
  "| 1          | 63700     | 3701      | 3            | 8.00       | 13309.60        | 0.10       | 0.02  | N            | O            | 1996-01-29 | 1996-03-05   | 1996-01-31    | TAKE BACK RETURN  | REG AIR    | riously. regular, express dep       |",
  "| 1          | 2132      | 4633      | 4            | 28.00      | 28955.64        | 0.09       | 0.06  | N            | O            | 1996-04-21 | 1996-03-30   | 1996-05-16    | NONE              | AIR        | lites. fluffily even de             |",
  "| 1          | 24027     | 1534      | 5            | 24.00      | 22824.48        | 0.10       | 0.04  | N            | O            | 1996-03-30 | 1996-03-14   | 1996-04-01    | NONE              | FOB        |  pending foxes. slyly re            |",
  "| 1          | 15635     | 638       | 6            | 32.00      | 49620.16        | 0.07       | 0.02  | N            | O            | 1996-01-30 | 1996-02-07   | 1996-02-03    | DELIVER IN PERSON | MAIL       | arefully slyly ex                   |",
  "| 2          | 106170    | 1191      | 1            | 38.00      | 44694.46        | 0.00       | 0.05  | N            | O            | 1997-01-28 | 1997-01-14   | 1997-02-02    | TAKE BACK RETURN  | RAIL       | ven requests. deposits breach a     |",
  "| 3          | 4297      | 1798      | 1            | 45.00      | 54058.05        | 0.06       | 0.00  | R            | F            | 1994-02-02 | 1994-01-04   | 1994-02-23    | NONE              | AIR        | ongside of the furiously brave acco |",
  "| 3          | 19036     | 6540      | 2            | 49.00      | 46796.47        | 0.10       | 0.00  | R            | F            | 1993-11-09 | 1993-12-20   | 1993-11-24    | TAKE BACK RETURN  | RAIL       |  unusual accounts. eve              |",
  "| 3          | 128449    | 3474      | 3            | 27.00      | 39890.88        | 0.06       | 0.07  | A            | F            | 1994-01-16 | 1993-11-22   | 1994-01-23    | DELIVER IN PERSON | SHIP       | nal foxes wake.                     |",
  "+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+--------------+--------------+------------+--------------+---------------+-------------------+------------+-------------------------------------+"
]);
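
To show how a batch size shapes what an iterator yields, here is a standard-library-only sketch. It does not use the arrow crate and is not how this crate is implemented: `ColumnBatch`, `Batcher`, the two-column layout, and the default batch size of 8192 are all assumptions made for illustration. It wraps a row iterator and emits column-oriented batches of at most `batch_size` rows.

```rust
/// Hypothetical column-oriented batch: one Vec per column.
#[derive(Debug, PartialEq)]
struct ColumnBatch {
    keys: Vec<i64>,
    quantities: Vec<i64>,
}

/// Wraps a row iterator and yields column-oriented batches.
struct Batcher<I> {
    rows: I,
    batch_size: usize,
}

impl<I: Iterator<Item = (i64, i64)>> Batcher<I> {
    fn new(rows: I) -> Self {
        // 8192 is an arbitrary default chosen for this sketch.
        Self { rows, batch_size: 8192 }
    }

    /// Builder-style setter for the maximum number of rows per batch.
    fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = batch_size;
        self
    }
}

impl<I: Iterator<Item = (i64, i64)>> Iterator for Batcher<I> {
    type Item = ColumnBatch;

    fn next(&mut self) -> Option<ColumnBatch> {
        // Pull up to `batch_size` rows and split them into columns.
        let (keys, quantities): (Vec<i64>, Vec<i64>) =
            self.rows.by_ref().take(self.batch_size).unzip();
        if keys.is_empty() {
            None
        } else {
            Some(ColumnBatch { keys, quantities })
        }
    }
}
```

With 25 input rows and a batch size of 10, this sketch yields three batches. The last one holds the remaining five rows.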
