#bioinformatics #fastq #bam #fuzzy-matching #vcf #genomics

app rust-bio-tools

A set of fast and robust command line utilities for bioinformatics tasks based on Rust-Bio

86 releases

0.42.2 Apr 10, 2024
0.42.0 Jan 18, 2023
0.41.0 Aug 23, 2022
0.40.0 Jul 21, 2022
0.2.4 Mar 6, 2018


Custom license and GPL-3.0+



Rust-Bio-Tools

A set of ultra-fast and robust command line utilities for bioinformatics tasks, based on Rust-Bio. Rust-Bio-Tools provides a command, rbt, which currently supports the following operations:

  • a linear time implementation for fuzzy matching of two vcf/bcf files (rbt vcf-match)
  • a vcf/bcf to txt converter that allows flexible selection of tags and properly handles multiallelic sites (rbt vcf-to-txt)
  • a linear time round-robin FASTQ splitter that splits a given FASTQ file into a given number of chunks (rbt fastq-split)
  • a linear time extraction of depth information from BAMs at given loci (rbt bam-depth)
  • a utility to quickly filter records from a FASTQ file (rbt fastq-filter)
  • a tool to merge BAM or FASTQ reads using marked duplicates or unique molecular identifiers (UMIs), respectively (rbt collapse-reads-to-fragments bam|fastq)
  • a tool to generate interactive HTML-based reports that offer multiple plots visualizing the provided genomics data in VCF and BAM format (rbt vcf-report)
  • a tool to generate an interactive HTML-based report, including visualizations, from a csv file (rbt csv-report)
  • a tool for splitting VCF/BCF files into N equal chunks, including BND support (rbt vcf-split)
  • a tool to generate visualizations for a specific region of one or multiple BAM files with a given reference contained in a single HTML file (rbt plot-bam)
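
To illustrate what round-robin splitting means for rbt fastq-split, here is a minimal sketch in Rust. It is not the code used by Rust-Bio-Tools: the names FastqRecord, parse_fastq and split_round_robin are hypothetical, and the sketch assumes well-formed FASTQ input with exactly four lines per record, held in memory. Record i is assigned to chunk i modulo the number of chunks, so a single pass over the input suffices.

```rust
/// One FASTQ record: header, sequence, separator and quality lines.
/// (Hypothetical type for illustration; not part of Rust-Bio-Tools.)
#[derive(Debug, Clone, PartialEq)]
pub struct FastqRecord {
    pub header: String,
    pub sequence: String,
    pub separator: String,
    pub qualities: String,
}

/// Parse FASTQ text, assuming exactly four lines per record.
/// Trailing lines that do not form a complete record are ignored.
pub fn parse_fastq(text: &str) -> Vec<FastqRecord> {
    let lines: Vec<&str> = text.lines().collect();
    lines
        .chunks_exact(4)
        .map(|rec| FastqRecord {
            header: rec[0].to_string(),
            sequence: rec[1].to_string(),
            separator: rec[2].to_string(),
            qualities: rec[3].to_string(),
        })
        .collect()
}

/// Distribute records over `n_chunks` in round-robin order:
/// record i goes to chunk i % n_chunks. Each record is visited once,
/// so the running time is linear in the number of records.
pub fn split_round_robin(records: Vec<FastqRecord>, n_chunks: usize) -> Vec<Vec<FastqRecord>> {
    assert!(n_chunks > 0, "need at least one chunk");
    let mut chunks: Vec<Vec<FastqRecord>> = vec![Vec::new(); n_chunks];
    for (i, record) in records.into_iter().enumerate() {
        chunks[i % n_chunks].push(record);
    }
    chunks
}
```

With three records and two chunks, the first and third records land in the first chunk and the second record in the second chunk. A streaming variant would write each record to its output file as it is read instead of collecting everything in memory.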

Further functionality is added as the authors need it. Check out the Contributing section if you want to contribute anything yourself. For a list of changes, take a look at the CHANGELOG.

Installation

Requirements

Rust-Bio-Tools depends on rgsl, which requires GSL to be installed:

  • Ubuntu: sudo apt-get install libgsl-dev
  • Arch: sudo pacman -S gsl
  • OSX: brew install gsl

Bioconda

Rust-Bio-Tools is available via Bioconda. With Bioconda set up, installation is as easy as

conda install rust-bio-tools

Cargo

If the Rust compiler and associated Cargo are installed, Rust-Bio-Tools may be installed via

cargo install rust-bio-tools

Source

Download the source code and, within its root directory, run

cargo install --path .

Usage and Documentation

Rust-Bio-Tools installs a command line utility rbt. Issue

rbt --help

for a summary of all options and tools.

Contributing

Contributions are highly welcome. If you plan to contribute, we suggest installing pre-commit hooks. To do so:

  1. Install pre-commit as explained in its installation instructions
  2. Run pre-commit install in the rust-bio-tools base directory

This should format, check and lint your code when committing.

Authors
