42 breaking releases
Uses new Rust 2024
Version | Released |
---|---|
0.43.0 (new) | Apr 10, 2025 |
0.41.0 | Apr 9, 2025 |
0.32.0 | Mar 31, 2025 |
PolarsView
A fast and interactive viewer for CSV, JSON (including Newline-Delimited JSON - NDJSON), and Apache Parquet files, built with Polars and egui.
This project is inspired by and initially forked from the parqbench project.
Features
- Fast Data Handling: Uses the Polars DataFrame library for efficient data loading, processing, and querying.
- Multiple File Format Support:
  - Load data from: CSV, JSON, NDJSON (Newline-Delimited JSON), Parquet.
  - Save data as: CSV, JSON, NDJSON, Parquet (via "Save As..." [Ctrl+A]).
- Interactive Table View:
  - Supports sorting by multiple columns simultaneously: Click column header icons to sort the entire DataFrame asynchronously. The order of clicks determines sort precedence. The 5-state cycle for each column controls direction and null placement:
    - ↕ : Not Sorted
    - ⏷ : Descending, Nulls First
    - ⏶ : Ascending, Nulls First
    - ⬇ : Descending, Nulls Last
    - ⬆ : Ascending, Nulls Last
    - ↕ : Back to Not Sorted

    (Numbers indicate sort precedence if multiple columns are sorted)
  - Customizable Header: Toggle visual style ("Enhanced Header"), adjust vertical padding ("Header Padding").
  - Column Sizing: Choose automatic content-based sizing ("Auto Col Width": true) or faster fixed initial widths ("Auto Col Width": false). Manually resize columns by dragging separators.
- SQL Querying: Filter and transform data using Polars' SQL interface. Execute queries asynchronously via the "Query" panel.
- String Column Number Normalization (CLI): Use the `--apply-regex` (`-a`) argument to select string columns (via the wildcard `*` or a `^...$` regex pattern matching column names) containing European-style numbers (e.g., '1.234,56') and convert them to standard Float64 format (e.g., 1234.56) on load.
- Configuration Panels (Side Bar):
  - Info: Displays file dimensions (rows, columns).
  - Format: Set text alignment, float decimal places, column width strategy, header style, and header padding.
  - Query: Configure the SQL query, add an optional row index column (with custom name/offset), normalize columns, remove all-null columns, set the number of schema inference rows (CSV/JSON/NDJSON), the CSV delimiter, and custom CSV null values, and view SQL examples.
  - Columns: Shows column names and Polars data types. Right-click a column name to copy it.
- Asynchronous Operations: Utilizes Tokio for non-blocking file I/O, sorting, and SQL execution, keeping the UI responsive. Shows a spinner during processing.
- Drag and Drop: Load files by dropping them onto the application window.
- Robust Error Handling: Displays errors (file loading, parsing, SQL, etc.) in a non-blocking notification window.
- Theming: Switch between Light and Dark themes via the menu bar.
- Persistence: Remembers window size and position between sessions.
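The number normalization described above ('1.234,56' becoming 1234.56) can be illustrated with a small standalone sketch. This is not PolarsView's implementation, which operates on Polars string columns selected by `--apply-regex`; the function name `normalize_euro_number` is hypothetical, and the sketch assumes '.' is always a thousands separator and ',' always the decimal mark.

```rust
/// Hypothetical helper: convert a European-style number such as "1.234,56"
/// into an f64. Returns None when the normalized text is not a valid number.
fn normalize_euro_number(text: &str) -> Option<f64> {
    let trimmed = text.trim();
    if trimmed.is_empty() {
        return None;
    }
    // Assumption: '.' is a thousands separator and ',' is the decimal mark.
    let normalized: String = trimmed
        .chars()
        .filter(|c| *c != '.')
        .map(|c| if c == ',' { '.' } else { c })
        .collect();
    normalized.parse::<f64>().ok()
}

fn main() {
    for raw in ["1.234,56", "-0,5", "12", "abc"] {
        println!("{raw:?} -> {:?}", normalize_euro_number(raw));
    }
}
```

Values that do not parse map to `None` here, which mirrors how a column cast would typically yield nulls for unparseable cells. This is also why applying the option to all string columns with `*` deserves caution.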
Building and Running
- Prerequisites:
  - Rust and Cargo (latest stable version recommended; minimum Rust version 1.86, edition 2024).
- Clone the Repository:

      git clone https://github.com/claudiofsr/polars-view.git
      cd polars-view

- Build and Install (Release Mode):

      # Build with default features (uses 'format-simple')
      cargo b -r && cargo install --path=.

      # --- OR Build with Specific Features ---
      # Example: Build with 'format-special' (formats 'Alíq'/'Aliq' columns differently)
      cargo b -r && cargo install --path=. --features format-special

  This compiles optimized code and installs the `polars-view` binary to `~/.cargo/bin/`.
- Run:

      polars-view [path_to_file] [options]

  - If `[path_to_file]` is provided (CSV, JSON, NDJSON, Parquet), it's loaded on startup.
  - Run `polars-view --help` for command-line options (`-a`/`--apply-regex`, `--delimiter`, `-q`, `--table-name`, `--null-values`, `--remove-null-cols`).
- Logging/Tracing: Control log detail using the `RUST_LOG` environment variable (values: `error`, `warn`, `info`, `debug`, `trace`). Remember to `export` it before running:

      # Example: Run with debug level logging
      export RUST_LOG=debug
      polars-view data.parquet

- Examples:

      polars-view sales_data.parquet
      polars-view --delimiter="|" transactions.csv --null-values="N/A,-"
      polars-view data.csv -q "SELECT category, SUM(value) AS total FROM AllData GROUP BY category"

      # Normalize Euro numbers in columns matching "^Value.*$"
      polars-view data.csv -a "^Value.*$"

      # Normalize Euro numbers in ALL string columns (Use with caution!)
      polars-view data.csv -a "*"

      # Use backticks/quotes for names with spaces/special chars
      polars-view items.csv -q "SELECT \`Item Name\`, Price FROM AllData WHERE Price > 100.0"
      polars-view logs.ndjson -q 'SELECT timestamp, message FROM AllData WHERE level = "ERROR"'

      # Remove all-null columns on load
      polars-view big_dataset.parquet --remove-null-cols

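One of the examples above passes `--null-values="N/A,-"` together with a `|` delimiter. As a rough illustration of what such a comma-separated list of null markers means for the loaded data, the sketch below splits the option value and maps matching cells to `None`. The helper names `parse_null_values` and `cell_or_null` are hypothetical, and the real parsing is presumably delegated to Polars' CSV reader rather than done by hand like this.

```rust
/// Hypothetical helper: split a comma-separated option value (e.g. "N/A,-")
/// into individual null markers, ignoring empty entries.
fn parse_null_values(arg: &str) -> Vec<String> {
    arg.split(',')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

/// Hypothetical helper: map a raw cell to None when it equals a null marker.
fn cell_or_null<'a>(cell: &'a str, null_values: &[String]) -> Option<&'a str> {
    let trimmed = cell.trim();
    if null_values.iter().any(|n| n.as_str() == trimmed) {
        None
    } else {
        Some(trimmed)
    }
}

fn main() {
    let nulls = parse_null_values("N/A,-");
    // One '|'-delimited row, matching the --delimiter="|" example.
    let row = "42|N/A|-|widget";
    let cells: Vec<Option<&str>> = row
        .split('|')
        .map(|cell| cell_or_null(cell, &nulls))
        .collect();
    println!("{cells:?}");
}
```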
Usage Guide
- Opening Files: Use the command line, "File" > "Open File..." (Ctrl+O), or drag & drop.
- Viewing Data: Scroll the table. Click header icons to apply/cycle sorting (supports multiple columns; order matters). Drag header separators to resize columns.
- Configuring View & Data: Use left-side panels ("Info", "Format", "Query", "Columns"). Format changes update the view efficiently; Query/Filter changes trigger an asynchronous data reload/requery.
- Applying SQL: Enter query in "Query" panel (default table: `AllData`). Click "Apply SQL Commands". See examples or Polars SQL docs.
- Saving Data:
  - Save (Ctrl+S): Overwrites the original file path.
  - Save As... (Ctrl+A): Saves current view to a new file. Choose format (CSV, JSON, NDJSON, Parquet) via dialog.
- Exiting: Use "File" > "Exit" or close the window.
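The header-click sort cycle mentioned in the guide (Not Sorted, Descending/Nulls First, Ascending/Nulls First, Descending/Nulls Last, Ascending/Nulls Last, then back to Not Sorted) behaves like a small state machine. The sketch below models it with a hypothetical `SortState` enum and a comparator over `Option<i64>` values. It only illustrates the ordering semantics for a single column and is not taken from the PolarsView source, which sorts Polars DataFrames.

```rust
use std::cmp::Ordering;

/// Hypothetical model of the per-column sort state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SortState {
    NotSorted,
    DescendingNullsFirst,
    AscendingNullsFirst,
    DescendingNullsLast,
    AscendingNullsLast,
}

impl SortState {
    /// Advance to the next state, as one click on the header icon would.
    fn next(self) -> Self {
        match self {
            SortState::NotSorted => SortState::DescendingNullsFirst,
            SortState::DescendingNullsFirst => SortState::AscendingNullsFirst,
            SortState::AscendingNullsFirst => SortState::DescendingNullsLast,
            SortState::DescendingNullsLast => SortState::AscendingNullsLast,
            SortState::AscendingNullsLast => SortState::NotSorted,
        }
    }

    /// Compare two nullable values according to direction and null placement.
    fn compare(self, a: &Option<i64>, b: &Option<i64>) -> Ordering {
        let (descending, nulls_first) = match self {
            // A stable sort with all-equal comparisons keeps the original order.
            SortState::NotSorted => return Ordering::Equal,
            SortState::DescendingNullsFirst => (true, true),
            SortState::AscendingNullsFirst => (false, true),
            SortState::DescendingNullsLast => (true, false),
            SortState::AscendingNullsLast => (false, false),
        };
        match (a, b) {
            (None, None) => Ordering::Equal,
            (None, Some(_)) => if nulls_first { Ordering::Less } else { Ordering::Greater },
            (Some(_), None) => if nulls_first { Ordering::Greater } else { Ordering::Less },
            (Some(x), Some(y)) => if descending { y.cmp(x) } else { x.cmp(y) },
        }
    }
}

fn main() {
    let mut state = SortState::NotSorted;
    // Five clicks walk through every state and return to NotSorted.
    for _ in 0..5 {
        state = state.next();
        let mut values = vec![Some(3), None, Some(1), Some(2)];
        values.sort_by(|a, b| state.compare(a, b));
        println!("{state:?}: {values:?}");
    }
}
```

For multi-column sorting, comparators like this could be chained in click order (for example with `Ordering::then_with`), which matches the guide's note that the order of clicks determines precedence.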
Core Dependencies
- GUI Framework: `eframe`, `egui`, `egui_extras`
- Data Handling: `polars` (with features like `lazy`, `csv`, `json`, `parquet`, `sql`)
- Asynchronous Runtime: `tokio` (with features like `rt`, `sync`, `rt-multi-thread`)
- Command Line: `clap`, `anstyle`
- File Dialogs: `rfd`
- Logging/Diagnostics: `tracing`, `tracing-subscriber`
- Utilities: `thiserror`, `cfg-if`, `env_logger` (non-wasm)
License
This project is licensed under the MIT License.