#fasta #gff3 #ratio #statistics #dna

app stats_on_gff3

Calculate statistics such as CDS GC3 ratio, intron GC ratio, flanking gene region GC ratio, first intron length, number of introns, CpG ratio, etc.

Examples:

stats_on_gff3 Homo_sapiens.GRCh38.109.chromosome.1.gff3 Homo_sapiens.GRCh38.dna.chromosome.1.fa

zcat Ciona_savignyi.CSAV2.0.dna.toplevel.fa.gz | stats_on_gff3 Ciona_savignyi.CSAV2.0.109.gff3 stdin

See https://gitlab.in2p3.fr/penel/stats_on_gff3
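To make two of the statistics named above concrete, here is a minimal sketch of a GC ratio and a GC3 ratio (GC content at third codon positions) for an uppercase sequence. It is an illustration only, not the crate's implementation: the function names gc_ratio and gc3_ratio are hypothetical, and the sketch assumes the CDS is already spliced and in frame, which may differ from how stats_on_gff3 handles phase and partial codons.

```rust
/// Fraction of G or C bases in an uppercase DNA sequence.
/// Returns None for an empty sequence.
/// (Hypothetical helper, not part of stats_on_gff3.)
fn gc_ratio(seq: &str) -> Option<f64> {
    if seq.is_empty() {
        return None;
    }
    let gc = seq.bytes().filter(|&b| b == b'G' || b == b'C').count();
    Some(gc as f64 / seq.len() as f64)
}

/// Fraction of complete codons whose third base is G or C.
/// Assumes `cds` starts in frame; trailing bases that do not
/// form a full codon are ignored. Returns None if there is no
/// complete codon.
/// (Hypothetical helper, not part of stats_on_gff3.)
fn gc3_ratio(cds: &str) -> Option<f64> {
    let bytes = cds.as_bytes();
    let codons = bytes.len() / 3;
    if codons == 0 {
        return None;
    }
    let gc3 = bytes
        .chunks_exact(3)
        .filter(|codon| codon[2] == b'G' || codon[2] == b'C')
        .count();
    Some(gc3 as f64 / codons as f64)
}
```

For instance, gc3_ratio("ATGAAA") looks at the codons ATG and AAA, finds one third position that is G or C, and returns Some(0.5).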

26 releases

0.1.26 Feb 22, 2024
0.1.25 Feb 22, 2024
0.1.18 Aug 1, 2023
0.1.17 Jul 28, 2023
0.1.4 Apr 28, 2023

#56 in Biology

CECILL-2.1

61KB
1K SLoC

Stats of gff3 files

Install:

cargo install stats_on_gff3

Crates: https://crates.io/crates/stats_on_gff3

Examples:

stats_on_gff3 --precision 1000 Homo_sapiens.GRCh38.109.chromosome.1.gff3 Homo_sapiens.GRCh38.dna.chromosome.1.fa 2>err

stats_on_gff3 --all Homo_sapiens.GRCh38.109.chromosome.1.gff3 Homo_sapiens.GRCh38.dna.chromosome.1.fa 2>err

zcat Ciona_savignyi.CSAV2.0.dna.toplevel.fa.gz | stats_on_gff3 --precision 100 Ciona_savignyi.CSAV2.0.109.gff3 stdin 2> err

Input data:

A GFF3 file and its associated FASTA file from Ensembl. FASTA sequences should be in uppercase. For NCBI data, see stats_on_gff3_ncbi (https://crates.io/crates/stats_on_gff3_ncbi).
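If a FASTA file contains lowercase bases (for example a soft-masked download), it can be converted before use. The following is a minimal sketch of such a conversion; uppercase_fasta is a hypothetical helper, not something provided by stats_on_gff3, and it assumes the whole file fits in memory as a string.

```rust
/// Return a copy of FASTA text with sequence lines uppercased.
/// Header lines (starting with '>') are left unchanged.
/// Every output line ends with '\n'.
/// (Hypothetical helper, not part of stats_on_gff3.)
fn uppercase_fasta(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 1);
    for line in input.lines() {
        if line.starts_with('>') {
            // Keep identifiers and descriptions exactly as written.
            out.push_str(line);
        } else {
            // Only ASCII letters are affected; other bytes pass through.
            out.push_str(&line.to_ascii_uppercase());
        }
        out.push('\n');
    }
    out
}
```

Headers are left alone because sequence identifiers are case-sensitive and need to keep matching the names used in the GFF3 file.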

Dependencies

~19MB
~321K SLoC