2 stable releases

1.0.1 Nov 22, 2024

#626 in Parser implementations

GPL-3.0-or-later

405KB
2.5K SLoC

 █▀▀ █▀▀ █ █   █▀▀ █▀█ █▄ █ █ █ █▀▀ █▀█ ▀█▀ █▀▀ █▀█
 █▄▄ ▄▄█ ▀▄▀   █▄▄ █▄█ █ ▀█ ▀▄▀ ██▄ █▀▄  █  ██▄ █▀▄

A tool to convert a CSV file into a new format


Description

csv_converter is a Rust-based CLI application that converts CSV files into any layout you want, driven by a powerful config file. At The Working Party we use this tool to streamline the preparation of bulk product data for Shopify imports, making it easier to import massive inventories.

Features

  • Easy Conversion: Transform standard CSV files into the CSV layout you need with a config file that is easy to write in any spreadsheet processor.
  • Fast Processing: Leverages Rust's performance to handle very large files efficiently.
  • No dependencies: This app uses no external crates.

How does it work?

Imagine you scrape a website with your favorite scraper and now have a huge spreadsheet with a lot of data, like this input CSV:

View the raw CSV
URL,name,image1,image2,image3,SKU,description,data1,data2,variant1,variant2
https://myshop.tld/product/berta2-green-holster,Berta2,https://cdn.myshop.tld/img1.jpg,https://cdn.myshop.tld/img2.jpg,https://cdn.myshop.tld/img3.jpg,berta2,Berta2 is the new and improved berta,,,black,green
https://myshop.tld/product/susan-organic,Susan,https://cdn.myshop.tld/img1.jpg,https://cdn.myshop.tld/img2.jpg,https://cdn.myshop.tld/img3.jpg,susan,Buy Susan,,,organic,toxic

These spreadsheets can be very large and contain many cells that you may not even need. Other cells need to be reshuffled or split onto their own line.

A good spreadsheet for the above data could look like this output CSV:

View the raw CSV
Handle,Command,Name,Description,Variant ID,Variant Command,Option1 Name,Option1 Value
berta2,NEW,Berta2,Berta2 is the new and improved berta,,MERGE,Material,black
berta2,MERGE,,,,MERGE,Material,green
susan,NEW,Susan,Buy Susan,,MERGE,Material,organic
susan,MERGE,,,,MERGE,Material,toxic

To get there, you have to split each input line into two and make sure the right cells end up under the right headings.

With csv_converter you can do this by creating a config spreadsheet like this config CSV:

View the raw CSV
Handle,Command,Name,Description,Variant ID,Variant Command,Option1 Name,Option1 Value
<cell6>,NEW,<cell2>,<cell7>,,MERGE,Material,<cell10>
<cell6>,MERGE,,,,MERGE,Material,<cell11>

The first line of the config is the heading row you want in the output. No changes will be made to it.

All lines after that are free for you to allocate. You reference cells by using the <cell[x]> token, where [x] is the column number in your input, starting at 1. Each reference points to a cell in a single line of your input. Each line from your input CSV file will be processed via this config.

```
┌────────────────────────────────────────────────────────────┐
│                         Input.csv                          │
├───────┬───────┬───────────┬─────────┬──────────────┬───────┤
│Heading│Heading│  Heading  │ Heading │   Heading    │Heading│
├───────┼───────┼───────────┼─────────┼──────────────┼───────┤
│<cell1>│<cell2>│  <cell3>  │ <cell4> │   <cell5>    │<cell6>│
├───────┼───────┼───────────┼─────────┼──────────────┼───────┤
│  ...  │  ...  │    ...    │   ...   │     ...      │  ...  │
├───────┼───────┼───────────┼─────────┼──────────────┼───────┤
│  ...  │  ...  │    ...    │   ...   │     ...      │  ...  │
├───────┼───────┼───────────┼─────────┼──────────────┼───────┤
│  ...  │  ...  │    ...    │   ...   │     ...      │  ...  │
└───────┴───────┴───────────┴─────────┴──────────────┴───────┘
                              │
                              │
                              ▼
┌───────────────────────────────────────────────────────────────────┐
│                            Config.csv                             │
├───────┬───────────┬─────────┬──────────────┬───────┬──────────────┤
│Heading│  Heading  │ Heading │   Heading    │Heading│   Heading    │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│<cell…>│  <cell…>  │  MERGE  │   <cell…>    │<cell…>│ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│<cell…>│  <cell…>  │   NEW   │              │<cell…>│ https://...  │
└───────┴───────────┴─────────┴──────────────┴───────┴──────────────┘
                                 │
                                 │
                                 ▼
┌───────────────────────────────────────────────────────────────────┐
│                            Output.csv                             │
├───────┬───────────┬─────────┬──────────────┬───────┬──────────────┤
│Heading│  Heading  │ Heading │   Heading    │Heading│   Heading    │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │  MERGE  │     ...      │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │   NEW   │              │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │  MERGE  │     ...      │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │   NEW   │              │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │  MERGE  │     ...      │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │   NEW   │              │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │  MERGE  │     ...      │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │   NEW   │              │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │  MERGE  │     ...      │  ...  │ https://...  │
├───────┼───────────┼─────────┼──────────────┼───────┼──────────────┤
│  ...  │    ...    │   NEW   │              │  ...  │ https://...  │
└───────┴───────────┴─────────┴──────────────┴───────┴──────────────┘
```

In this example we split every input line into two, which doubles the number of lines in the output file.
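The substitution step described above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not csv_converter's actual implementation: the function names expand_row and convert are hypothetical, only plain <cellN> tokens are handled (no filters or conditions), and a reference to a missing column is assumed to produce an empty cell.

```rust
/// Expands plain `<cellN>` tokens in one config row against one input row.
/// Cell numbers are 1-based, matching the config example.
/// Hypothetical helper: filters and conditions are deliberately not handled.
fn expand_row(config_row: &[&str], input_row: &[&str]) -> Vec<String> {
    config_row
        .iter()
        .map(|template| {
            // Only a cell that consists of exactly one `<cellN>` token is resolved.
            let index = template
                .strip_prefix("<cell")
                .and_then(|rest| rest.strip_suffix('>'))
                .and_then(|digits| digits.parse::<usize>().ok());
            match index {
                // A reference to a missing column becomes an empty cell (assumption).
                Some(n) if n >= 1 => input_row.get(n - 1).copied().unwrap_or("").to_string(),
                // Anything else is copied through as literal text.
                _ => template.to_string(),
            }
        })
        .collect()
}

/// Runs every input row through every config row, so two config rows
/// produce two output rows per input row.
fn convert(config_rows: &[Vec<&str>], input_rows: &[Vec<&str>]) -> Vec<Vec<String>> {
    input_rows
        .iter()
        .flat_map(|input| config_rows.iter().map(move |config| expand_row(config, input)))
        .collect()
}
```

Passing the two config lines and the Berta2 input line from the example to convert yields the two berta2 lines of the output CSV.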

Config reference

The config file includes logic and filters that will make it easier for you to generate smarter outputs.

Filters

Filters allow you to make changes to the content of a cell.

Syntax: <cell[n] FILTER|'argument'|[number]>

(💡 You can combine filters simply by adding them: <cell1 TRIM APPEND|'!!!' UPPER_CASE> which will give us this: HELLO WORLD!!!)
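A minimal Rust sketch of how such a chain could be evaluated, assuming filters run left to right (which matches the example above). The Filter enum and the apply_filters function are hypothetical names used for illustration and cover only a handful of the filters.

```rust
/// A small subset of the filters, modelled as an enum (hypothetical type).
enum Filter {
    Trim,
    UpperCase,
    LowerCase,
    Append(String),
    Prepend(String),
}

/// Applies the filters one after another, left to right (assumed order).
fn apply_filters(cell: &str, filters: &[Filter]) -> String {
    filters.iter().fold(cell.to_string(), |value, filter| match filter {
        Filter::Trim => value.trim().to_string(),
        Filter::UpperCase => value.to_uppercase(),
        Filter::LowerCase => value.to_lowercase(),
        Filter::Append(suffix) => format!("{value}{suffix}"),
        Filter::Prepend(prefix) => format!("{prefix}{value}"),
    })
}
```

Order matters in this sketch: trimming before appending removes the padding around the original value, while appending first would leave the trailing whitespace in front of the appended text.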

For the documentation below we assume <cell1> has the value Hello World with surrounding whitespace, 15 characters in total.

UPPER_CASE

Convert the contents of a cell into upper case.

  • <cell1 UPPER_CASE> => HELLO WORLD

LOWER_CASE

Convert the contents of a cell into lower case.

  • <cell1 LOWER_CASE> => hello world

LENGTH

Convert the contents of a cell into the number of characters it contains.

  • <cell1 LENGTH> => 15

TRIM

Removes whitespace from both ends of the cell.

  • <cell1 TRIM> => Hello World

TRIM_START

Removes whitespace from the start of the cell.

  • <cell1 TRIM_START> => Hello World

TRIM_END

Removes whitespace from the end of the cell.

  • <cell1 TRIM_END> => Hello World

REPLACE|' '|'-'

Replaces a given string within the cell with another string.

  • <cell1 REPLACE|'World'|'Everyone'> => Hello Everyone

APPEND|'-end'

Adds something to the end of the cell.

  • <cell1 APPEND|'!!!'> => Hello World !!!

PREPEND|'pre-'

Adds something to the start of the cell.

  • <cell1 PREPEND|':)'> => :) Hello World

SPLIT|','|1

Splits the cell at every occurrence of the string you pass in and lets you select which of the resulting parts to output.

  • <cell1 SPLIT|'o'|1> => W
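The example above suggests that the part index is counted from 0: splitting Hello World on 'o' gives the parts "Hell", " W" and "rld", and index 1 selects the second one. A minimal Rust sketch under that assumption follows. The function name split_filter is hypothetical, and returning an empty string for an out-of-range index is a guess, not documented behavior.

```rust
/// Splits `cell` on `separator` and returns the part at the 0-based `index`.
/// Returning an empty string for an out-of-range index is an assumption.
fn split_filter(cell: &str, separator: &str, index: usize) -> String {
    cell.split(separator).nth(index).unwrap_or("").to_string()
}
```

Note that the selected part keeps its surrounding whitespace, so combining SPLIT with TRIM may be useful.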

SUB_STRING|10|5

Returns only a part of the cell, defined by a start and, optionally, an end. If no end is given, the rest of the cell is returned.

  • <cell1 SUB_STRING|1> => Hello World
  • <cell1 SUB_STRING|1|5> => Hello World

Conditions

Conditions allow you to add logic to a cell.

Syntax: :IF <cell1> [condition] ('then-item') [ELSE ('else-item')]

  • The ELSE clause is optional
  • A then-item can be a String or a cell: :IF <cell1> [condition] ('then-item') or :IF <cell1> [condition] (<cell2>)
  • All cells inside a condition support all filters

(💡 If any of your conditions evaluate to SKIP_THIS_LINE then the entire line won't be exported in the output)
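The following Rust sketch illustrates how a condition with an optional ELSE clause and the SKIP_THIS_LINE marker could be evaluated. It is an illustration under assumptions, not the tool's implementation: the Condition enum, evaluate and keep_row are hypothetical names, "empty" is taken to mean zero characters, and a missing ELSE clause is assumed to yield an empty cell.

```rust
/// A few of the conditions, modelled as an enum (hypothetical type).
enum Condition {
    IsEmpty,
    IsNotEmpty,
    StartsWith(String),
    Contains(String),
}

/// Evaluates `:IF <cell> [condition] (then) [ELSE (else)]` for one cell value.
/// A missing ELSE clause yields an empty cell (assumption).
fn evaluate(cell: &str, condition: &Condition, then_item: &str, else_item: Option<&str>) -> String {
    let matched = match condition {
        Condition::IsEmpty => cell.is_empty(),
        Condition::IsNotEmpty => !cell.is_empty(),
        Condition::StartsWith(prefix) => cell.starts_with(prefix.as_str()),
        Condition::Contains(needle) => cell.contains(needle.as_str()),
    };
    if matched {
        then_item.to_string()
    } else {
        else_item.unwrap_or("").to_string()
    }
}

/// Drops the whole output row if any of its cells evaluated to SKIP_THIS_LINE.
fn keep_row(row: Vec<String>) -> Option<Vec<String>> {
    if row.iter().any(|cell| cell == "SKIP_THIS_LINE") {
        None
    } else {
        Some(row)
    }
}
```

In this sketch, a config cell whose then-item is 'SKIP_THIS_LINE' removes the whole line from the output whenever its condition matches.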

IS_EMPTY

Checks if the cell is empty.

  • :IF <cell1> IS_EMPTY (<cell2>)

IS_NOT_EMPTY

Checks if the cell is not empty.

  • :IF <cell1> IS_NOT_EMPTY (<cell2>)

IS_NUMERIC

Checks if the cell is a number.

  • :IF <cell1> IS_NUMERIC (<cell2>)

STARTS_WITH|'beginning'

Checks if the cell starts with a given string.

  • :IF <cell1> STARTS_WITH|'beginning' (<cell2>)

ENDS_WITH|'end'

Checks if the cell ends with a given string.

  • :IF <cell1> ENDS_WITH|'end' (<cell2>)

CONTAINS|'happiness'

Checks if the cell contains a given string.

  • :IF <cell1> CONTAINS|'happiness' (<cell2>)

== 'this item'

Checks if the cell is equal to a given string.

  • :IF <cell1> == 'Same?' (<cell2>)

!= 'this item'

Checks if the cell is not equal to a given string.

  • :IF <cell1> != 'Not the Same?' (<cell2>)

> 42

Checks if the cell is greater than a given number.

  • :IF <cell1> > 42 (<cell2>)

< 42

Checks if the cell is less than a given number.

  • :IF <cell1> < 42 (<cell2>)

% 2 = 0

Checks if the cell, when divided by a given number, leaves a remainder equal to a given value.

  • :IF <cell1> % 2 = 0 (<cell2>)
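The numeric conditions can be sketched in Rust as follows. This is a minimal sketch with several assumptions that the documentation above does not confirm: cells are parsed as whole numbers (i64), a cell that is not a number never matches, and the remainder is computed so that it is never negative. NumericCondition and check_numeric are hypothetical names.

```rust
/// Numeric checks on a cell (hypothetical type covering `>`, `<` and `% n = r`).
enum NumericCondition {
    GreaterThan(i64),
    LessThan(i64),
    Modulo { divisor: i64, remainder: i64 },
}

/// Returns false when the cell is not a whole number or the divisor is zero
/// (both are assumptions of this sketch).
fn check_numeric(cell: &str, condition: &NumericCondition) -> bool {
    let Ok(value) = cell.trim().parse::<i64>() else {
        return false;
    };
    match *condition {
        NumericCondition::GreaterThan(limit) => value > limit,
        NumericCondition::LessThan(limit) => value < limit,
        // checked_rem_euclid returns None for a zero divisor instead of panicking.
        NumericCondition::Modulo { divisor, remainder } => {
            value.checked_rem_euclid(divisor) == Some(remainder)
        }
    }
}
```

With these definitions, `% 2 = 0` matches every even number, which is handy for treating alternating lines differently.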

CLI Usage

csv_converter [OPTIONS]

Options:
  -i <file>, --input <file>
        Specify the input file to process.
  -o <file>, --output <file>
        Specify the output file to write results to.
  -c <file>, --config <file>
        Specify the config file to determine what the output format is.
  -v, -V, --version
        Display the program's version information.
  -h, --help
        Display this help message.

Example command:

csv_converter -i input.csv -o output.csv -c config.csv
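For readers curious how flags like these can be handled without external crates, here is a small Rust sketch that uses only the standard library. It is not taken from the csv_converter source: the Settings struct and the parse_args function are hypothetical, and the version and help flags are left out for brevity.

```rust
/// The three file paths the CLI needs (hypothetical struct).
#[derive(Debug, Default, PartialEq)]
struct Settings {
    input: Option<String>,
    output: Option<String>,
    config: Option<String>,
}

/// Parses `-i/--input`, `-o/--output` and `-c/--config`, each followed by a path.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Settings, String> {
    let mut settings = Settings::default();
    let mut args = args.into_iter();
    while let Some(flag) = args.next() {
        // Pick the field this flag writes to, or bail out on anything unknown.
        let slot = match flag.as_str() {
            "-i" | "--input" => &mut settings.input,
            "-o" | "--output" => &mut settings.output,
            "-c" | "--config" => &mut settings.config,
            other => return Err(format!("unknown option: {other}")),
        };
        // The value is the argument that directly follows the flag.
        *slot = Some(args.next().ok_or_else(|| format!("missing value for {flag}"))?);
    }
    Ok(settings)
}
```

In a real binary the arguments would come from std::env::args().skip(1), which drops the program name.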

Installation

Prerequisites

  • Rust: Ensure you have Rust installed. You can download it from rust-lang.org.

Install via cargo

cargo install csv_converter

Build from Source

git clone https://github.com/the-working-party/csv_converter.git
cd csv_converter
cargo build --release
# Now run the app via "cargo run --release" instead of "csv_converter" or locate the binary in your target folder

Contributing

Contributions are welcome. Please open an issue or submit a pull request on the GitHub repository to contribute to this project.

Licensing

Copyleft (c) 2024 Licensed under GPL-3.0-or-later.
