bin+lib polars-view

A fast viewer for Parquet and CSV files, powered by Polars and egui

6 releases (breaking)

new 0.6.0 Mar 8, 2025
0.5.0 Mar 8, 2025
0.4.0 Mar 8, 2025
0.3.0 Mar 7, 2025
0.1.0 Mar 6, 2025

MIT license

PolarsView

A fast viewer for Parquet and CSV files, built with Rust, Polars, and egui.

This project is inspired by and forked from the parqbench project.

Features

  • Fast Loading: Leverages Polars for efficient data loading and processing.
  • Parquet and CSV Support: Handles both Parquet and CSV file formats.
  • SQL Querying: Apply SQL queries to filter and transform data.
  • Interactive Table: Displays data in a sortable, scrollable table.
  • Data Filtering: Provides UI elements for setting CSV delimiters, schema inference length, and decimal precision.
  • Metadata Display: Shows file metadata (column count, row count) and schema information.
  • Asynchronous Operations: Uses Tokio for non-blocking file loading and query execution.
  • Customizable UI: Uses egui for a responsive and customizable user interface.
  • Error Handling: Custom error enum type for robust error handling.
  • Sorting Capabilities: Interactive column sorting (ascending/descending).
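
As an illustration of the "custom error enum" feature, here is a minimal sketch written against the standard library only. The type `ViewerError`, its variants, and the helper `detect_format` are hypothetical names chosen for this example and are not taken from the polars-view source; the project lists thiserror as a dependency, which can derive the `Display` and `Error` implementations that are written out by hand here.

```rust
use std::error::Error;
use std::fmt;
use std::io;
use std::path::Path;

/// Hypothetical error type for a file viewer. The variant names are
/// illustrative and are not taken from the polars-view source.
#[derive(Debug)]
pub enum ViewerError {
    /// Wraps an I/O failure, such as a missing file.
    Io(io::Error),
    /// The file extension is neither `parquet` nor `csv`.
    UnsupportedFormat(String),
}

impl fmt::Display for ViewerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ViewerError::Io(e) => write!(f, "I/O error: {e}"),
            ViewerError::UnsupportedFormat(ext) => {
                write!(f, "unsupported file format: {ext}")
            }
        }
    }
}

impl Error for ViewerError {
    /// Exposes the wrapped I/O error, if any, as the underlying cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ViewerError::Io(e) => Some(e),
            ViewerError::UnsupportedFormat(_) => None,
        }
    }
}

/// Lets `?` convert an `io::Error` into a `ViewerError` automatically.
impl From<io::Error> for ViewerError {
    fn from(e: io::Error) -> Self {
        ViewerError::Io(e)
    }
}

/// Classifies a path by its extension (case-insensitive).
pub fn detect_format(path: &str) -> Result<&'static str, ViewerError> {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();
    match ext.as_str() {
        "parquet" => Ok("parquet"),
        "csv" => Ok("csv"),
        _ => Err(ViewerError::UnsupportedFormat(ext)),
    }
}
```

With a single enum like this, every fallible step can return `Result<_, ViewerError>`, and the UI layer can match on the variant to decide which message to show.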

Building and Running

  1. Prerequisites:

    • Rust and Cargo (latest stable version recommended).
  2. Clone the Repository:

    git clone https://github.com/claudiofsr/polars-view.git
    cd polars-view
    
  3. Build and Install:

    cargo build --release && cargo install --path .
    
  4. Run:

    polars-view [path_to_file] [options]
    
    • Replace [path_to_file] with the actual path to your Parquet or CSV file.

    • Use polars-view --help for a list of available options (delimiter, query, table name).

    • Tracing Options (for logging): To enable detailed logging, use the RUST_LOG environment variable:

      RUST_LOG=info polars-view [path_to_file] [options]  # General info logs
      RUST_LOG=debug polars-view [path_to_file] [options] # More detailed logs
      RUST_LOG=trace polars-view [path_to_file] [options] # Very detailed logs (for debugging)
      

      You can also specify the log level for specific modules:

      RUST_LOG=polars_view=debug,polars=info polars-view [path_to_file] [options]
      
    • Examples:

      polars-view data.parquet
      polars-view --delimiter ';' data.csv --query "SELECT * FROM AllData WHERE x > 10"
      RUST_LOG=info polars-view data.parquet
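
The per-module `RUST_LOG` value shown earlier (`polars_view=debug,polars=info`) is a comma-separated list of `target=level` directives. The sketch below shows one way such a string can be split into pairs. It is a simplified illustration, not the parser polars-view or the tracing ecosystem actually uses: the function name `parse_directives` is hypothetical, and a bare word without `=` is assumed here to be a global level.

```rust
/// Splits a RUST_LOG-style string into (target, level) pairs.
///
/// Assumption for this sketch: a directive without `=` is treated as a
/// global level, so its target is `None`. Real log filters accept a
/// richer syntax than this.
pub fn parse_directives(spec: &str) -> Vec<(Option<String>, String)> {
    spec.split(',')
        .map(str::trim)
        // Skip empty pieces produced by stray or trailing commas.
        .filter(|directive| !directive.is_empty())
        .map(|directive| match directive.split_once('=') {
            Some((target, level)) => (Some(target.to_string()), level.to_string()),
            None => (None, directive.to_string()),
        })
        .collect()
}
```

Under these assumptions, `parse_directives("polars_view=debug,polars=info")` yields two pairs, one for the `polars_view` target at `debug` and one for the `polars` target at `info`.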
      

Usage

  • Open a File:

    • Run the application with a file path as an argument.
    • Use the "File" -> "Open" menu option.
    • Drag and drop a Parquet or CSV file onto the application window.
  • Filtering:

    • Use the "Query" panel to set the CSV delimiter, the schema inference length, and the number of decimal places.
    • Enter SQL queries in the "SQL Query" field.
    • Click "Apply" to apply the filters.
  • Sorting:

    • Click the column headers in the table to sort by that column (ascending/descending).
  • Metadata:

    • The "Metadata" panel shows file metadata.
    • The "Schema" panel displays the data schema.
  • SQL Examples:

    The interface includes predefined SQL commands for easy reference.
    These cover common filtering and aggregation operations.
    See the Polars SQL documentation for the supported syntax.

    SELECT * FROM AllData;
    SELECT * FROM AllData WHERE column_name > value;
    SELECT column1, COUNT(*) FROM AllData GROUP BY column1;
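
To make the aggregation example concrete, the following sketch reproduces in plain Rust what `SELECT column1, COUNT(*) FROM AllData GROUP BY column1` computes: one output row per distinct value, paired with the number of times it occurs. It does not use Polars; the function name `group_count` and the in-memory column are assumptions made for illustration only.

```rust
use std::collections::BTreeMap;

/// Counts how many times each value occurs in a column, mirroring
/// `SELECT column1, COUNT(*) FROM AllData GROUP BY column1`.
///
/// A BTreeMap keeps the keys sorted so the result is deterministic;
/// SQL GROUP BY by itself does not guarantee any row order.
pub fn group_count<'a>(column: &[&'a str]) -> BTreeMap<&'a str, usize> {
    let mut counts = BTreeMap::new();
    for &value in column {
        // Insert the key with a count of 0 on first sight, then increment.
        *counts.entry(value).or_insert(0) += 1;
    }
    counts
}
```

Adding `ORDER BY column1` to the SQL query would give the same ordering as iterating over the map returned here.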
    

Dependencies

  • Polars: High-performance DataFrame library.
  • eframe: Framework for running egui applications natively and on the web.
  • egui: Immediate-mode GUI library.
  • egui_extras: Additional widgets and utilities for egui, such as tables.
  • clap: Command-line argument parser.
  • tokio: Asynchronous runtime.
  • tracing: Application-level tracing framework.
  • thiserror: Library for deriving the Error trait.
  • rfd: Native file dialogs.
  • parquet: Crate for reading and writing Apache Parquet files.

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

This project is licensed under the MIT License.
