bin+lib seq-here

A fast tool for bio-sequence file processing

6 releases

Uses new Rust 2024

new 0.1.0 Apr 13, 2025
0.0.5 Apr 5, 2025
0.0.4 Mar 17, 2025

#126 in Biology


MIT license

55KB
942 lines

seq-here


Introduction

Seq-here is a fast tool for bio-sequence file processing.

Installation

You can install seq-here using cargo:

cargo install seq-here

or you can build it from source:

git clone git@github.com:bio-here/seq-here.git
cd seq-here
cargo build --release
cp target/release/seq-here /usr/local/bin

seq-here --version

Lib Crate

You can also use seq-here as a library crate in your project, by adding the following to your Cargo.toml:

[dependencies]
seq-here = "0.1.0"

Usage

To see detailed usage information, you can run:

seq-here --help
  • Info: Get basic information about the input sequence file(s).
# Fasta file information
seq-here info fa your_files.fasta,your_files2.fasta

# Fastq file information
seq-here info fq your_files.fastq

# GFF/GTF file information (GFF2 is not supported yet)
seq-here info gff your_files.gff

# -o, --output: output method, one of println, file, csv (default: println)
# Output files are written to the current directory
seq-here info fa your_files.fasta -o file

# Pass a directory to get information for all files under it
seq-here info fa your_dir
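
As a rough illustration of the kind of figures a FASTA summary can report, here is a minimal Rust sketch that counts records and total residues in FASTA text. It is not seq-here's implementation or library API; the names `FastaSummary` and `summarize_fasta` are hypothetical, and the fields the real `info` subcommand prints are not specified here.

```rust
/// Basic statistics for FASTA text (hypothetical type, not part of seq-here).
#[derive(Debug, PartialEq)]
pub struct FastaSummary {
    /// Number of header lines (lines starting with '>').
    pub records: usize,
    /// Total number of sequence characters across all records.
    pub total_len: usize,
}

/// Count records and residues in FASTA-formatted text.
/// Assumption: every non-header line is sequence data.
pub fn summarize_fasta(text: &str) -> FastaSummary {
    let mut summary = FastaSummary { records: 0, total_len: 0 };
    for line in text.lines() {
        let line = line.trim();
        if line.starts_with('>') {
            summary.records += 1;
        } else {
            summary.total_len += line.len();
        }
    }
    summary
}
```

Calling `summarize_fasta(">a\nACGT\nAC\n>b\nGG\n")` yields 2 records and a total length of 8.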
  • Process: Convert or process incoming sequence file(s).
# Combine files
seq-here process combine files_folder
seq-here process combine file1,file2,file3

# -o, --output <OutputFile>
#         Output file name. If the value is a directory, the default file name is used inside that directory.

seq-here process combine files_folder -o ./output/all.txt
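
Conceptually, combining files amounts to streaming each input into one output in order. The following Rust sketch shows that idea using only the standard library; `combine_files` is a hypothetical helper, and it is an assumption that plain concatenation matches what `process combine` does internally.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};

/// Concatenate `inputs` into `output`, in the given order.
/// Returns the total number of bytes written.
/// Hypothetical helper: not seq-here's actual API.
pub fn combine_files(inputs: &[PathBuf], output: &Path) -> io::Result<u64> {
    let mut writer = BufWriter::new(File::create(output)?);
    let mut total: u64 = 0;
    for input in inputs {
        let mut reader = File::open(input)?;
        // io::copy streams the file without loading it fully into memory.
        total += io::copy(&mut reader, &mut writer)?;
    }
    writer.flush()?;
    Ok(total)
}
```

Note that this sketch does not insert a newline between files, so an input that lacks a trailing newline would run into the next file's first header.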
  • Extract: Extract specified sequence segment or file data.
# Extract a sequence segment by id
seq-here extract segment input.fasta --file sequence_id.txt
seq-here extract segment input.fasta --str GhID00000001

# Extract a specific portion of a sequence by position (0-based coordinates)
seq-here extract segment input.fasta --str GhID00000001 --start 100 --end 200
seq-here extract segment input.fasta --file ids.txt --start 50 --end 150

# Extract sequences by given annotation file
seq-here extract explain --seq input.fasta --gff input.anno.gff -o output_path.fasta

# Extract only specific feature types from annotations
seq-here extract explain --seq input.fasta --gff input.anno.gff --type CDS,gene,mRNA -o output_path
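
The two coordinate conventions involved in extraction are easy to mix up, so here is a minimal Rust sketch of both. It is not seq-here's implementation: `extract_segment` and `gff_to_range` are hypothetical names, and the sketch assumes that `--start`/`--end` form a half-open range `[start, end)`, which the text above does not state (it only says coordinates are 0-based). GFF3 itself uses 1-based, inclusive start and end columns, so those are converted to a 0-based half-open range. Strand handling (reverse-complementing `-` features) is left out.

```rust
/// Return the 0-based, half-open slice `[start, end)` of the record whose
/// ID (first whitespace-delimited token of the header) equals `id`.
/// Assumptions: IDs are unique and the end coordinate is exclusive.
pub fn extract_segment(fasta: &str, id: &str, start: usize, end: usize) -> Option<String> {
    let mut seq = String::new();
    let mut in_target = false;
    let mut found = false;
    for line in fasta.lines() {
        if let Some(header) = line.strip_prefix('>') {
            in_target = header.split_whitespace().next() == Some(id);
            found |= in_target;
        } else if in_target {
            seq.push_str(line.trim());
        }
    }
    if !found {
        return None;
    }
    // `get` returns None for out-of-range or inverted coordinates.
    seq.get(start..end).map(str::to_string)
}

/// Parse one GFF3 line and return (seqid, start, end) as a 0-based,
/// half-open range, keeping only feature types listed in `wanted`
/// (an empty `wanted` keeps every type).
pub fn gff_to_range(line: &str, wanted: &[&str]) -> Option<(String, usize, usize)> {
    if line.starts_with('#') || line.trim().is_empty() {
        return None;
    }
    let cols: Vec<&str> = line.split('\t').collect();
    if cols.len() < 9 {
        return None;
    }
    if !wanted.is_empty() && !wanted.iter().any(|t| *t == cols[2]) {
        return None;
    }
    let start: usize = cols[3].parse().ok()?;
    let end: usize = cols[4].parse().ok()?;
    if start == 0 || end < start {
        return None;
    }
    // GFF3 is 1-based inclusive: [start, end] becomes [start - 1, end).
    Some((cols[0].to_string(), start - 1, end))
}
```

Under these assumptions, a GFF3 feature spanning columns `101` to `200` maps to the range `100..200`, which can be passed directly as `start` and `end` to `extract_segment`.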

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Dependencies

~20–29MB
~431K SLoC