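fakit: a small tool for processing FASTA sequences
I have been picking up Rust syntax on and off, and wanted to write something simple, mainly to get familiar with the language. This time I wrote fakit for simple processing of FASTA files. It has only a few options, and they can be combined through pipes; the main reason is that I don't know how to write anything more complex, haha.
GitHub: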
https://github.com/sharkLoc/fakit
install
git clone https://github.com/sharkLoc/fakit.git
cd fakit
cargo b --release
# mv target/release/fakit to anywhere you want
usage
fakit -h
fqkit: a simple program for fasta file manipulation
Usage: fakit [OPTIONS] [INPUT]
Arguments:
[INPUT] input fasta[.gz] file
Options:
-u, --upper convert base to uppercase
-l, --lower convert base to lowercase
-w, --length <LEN> base number of each line, 0 for long single line
-f, --fake <FAKE> fasta to fastq and generate fake fastq quality
-d, --drop <DROP> drop sequences with length shorter than int
-c, --convert <CONV> r for reverse seq, m for match seq
-s, --summary simple statistics of fasta file
-h, --help Print help information
-V, --version Print version information
example
test.fa
>s1
GAGATCGGAGAAGATAGTTTTAGGGTTTGAGATTGAGAAGAAGATGAAGAAAATTTATGA
>s2
gactnacntacnncGCACAAACAGGACgatgatgttgatCCGTGTGTGTACGTGAGTTGG
>s3
GAGAGACTCTTCGTAAGACAGTAAGATTGTGAAAGTCA
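Every option below works on records like these: a `>` header line followed by one or more sequence lines. As a rough sketch of how such a file could be read into records in Rust (this is not fakit's actual code; the `Record` struct and `parse_fasta` function are hypothetical names used only for illustration):

```rust
/// One FASTA record: the header text after '>' and the joined sequence.
/// (Hypothetical type for illustration, not taken from fakit.)
#[derive(Debug, PartialEq)]
struct Record {
    id: String,
    seq: String,
}

/// Parse FASTA text into records, joining multi-line sequences.
fn parse_fasta(text: &str) -> Vec<Record> {
    let mut records: Vec<Record> = Vec::new();
    for line in text.lines() {
        let line = line.trim_end();
        if let Some(header) = line.strip_prefix('>') {
            // A header line starts a new record.
            records.push(Record {
                id: header.to_string(),
                seq: String::new(),
            });
        } else if let Some(last) = records.last_mut() {
            // A sequence line is appended to the most recent record.
            last.seq.push_str(line);
        }
        // Sequence lines that appear before the first header are ignored.
    }
    records
}
```

Calling `parse_fasta` on the contents of test.fa would give three records with ids `s1`, `s2` and `s3`.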
fakit -u test.fa
>s1
GAGATCGGAGAAGATAGTTTTAGGGTTTGAGATTGAGAAGAAGATGAAGAAAATTTATGA
>s2
GACTNACNTACNNCGCACAAACAGGACGATGATGTTGATCCGTGTGTGTACGTGAGTTGG
>s3
GAGAGACTCTTCGTAAGACAGTAAGATTGTGAAAGTCA
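In the output above the bases are uppercased while the header lines (`>s2` and so on) are left alone. A minimal sketch of that behaviour, assuming ASCII input; `uppercase_sequences` is a hypothetical helper, not fakit's own function:

```rust
/// Uppercase every sequence line of a FASTA text, leaving header lines untouched.
fn uppercase_sequences(fasta: &str) -> String {
    let mut out = String::with_capacity(fasta.len());
    for line in fasta.lines() {
        if line.starts_with('>') {
            // Header: copy as-is.
            out.push_str(line);
        } else {
            // Sequence: convert ASCII letters to uppercase.
            out.push_str(&line.to_ascii_uppercase());
        }
        out.push('\n');
    }
    out
}
```

The lowercase direction (`-l`) would be the same loop with `to_ascii_lowercase`.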
fakit -u test.fa | fakit -w 30
>s1
GAGATCGGAGAAGATAGTTTTAGGGTTTGA
GATTGAGAAGAAGATGAAGAAAATTTATGA
>s2
GACTNACNTACNNCGCACAAACAGGACGAT
GATGTTGATCCGTGTGTGTACGTGAGTTGG
>s3
GAGAGACTCTTCGTAAGACAGTAAGATTGT
GAAAGTCA
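The `-w` option is fixed-width line wrapping, with 0 meaning "everything on one line". A small sketch of that logic, assuming ASCII sequences; `wrap_sequence` is a hypothetical name and this is not necessarily how fakit does it internally:

```rust
/// Split a sequence into lines of at most `width` bases.
/// A width of 0 keeps the whole sequence on a single line, mirroring `-w 0`.
/// An empty sequence yields one empty line.
fn wrap_sequence(seq: &str, width: usize) -> Vec<String> {
    if width == 0 || seq.is_empty() {
        return vec![seq.to_string()];
    }
    seq.as_bytes()
        .chunks(width) // the last chunk may be shorter than `width`
        .map(|chunk| String::from_utf8_lossy(chunk).into_owned())
        .collect()
}
```

With the 38-base `s3` sequence and a width of 30, this gives one line of 30 bases and one of 8, matching the output above.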
fakit -u test.fa | fakit -w 0 | fakit -l
>s1
gagatcggagaagatagttttagggtttgagattgagaagaagatgaagaaaatttatga
>s2
gactnacntacnncgcacaaacaggacgatgatgttgatccgtgtgtgtacgtgagttgg
>s3
gagagactcttcgtaagacagtaagattgtgaaagtca
fakit -u test.fa | fakit -d 50
>s1
GAGATCGGAGAAGATAGTTTTAGGGTTTGAGATTGAGAAGAAGATGAAGAAAATTTATGA
>s2
GACTNACNTACNNCGCACAAACAGGACGATGATGTTGATCCGTGTGTGTACGTGAGTTGG
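Here `s3` (38 bases) is dropped because it is shorter than 50. The filter can be sketched as below; `drop_short` is a hypothetical helper, and keeping sequences whose length is exactly equal to the threshold is an assumption based on the help text ("shorter than"):

```rust
/// Keep only (id, sequence) pairs whose sequence has at least `min_len` bases.
/// Assumption: a sequence of exactly `min_len` bases is kept.
fn drop_short(records: Vec<(String, String)>, min_len: usize) -> Vec<(String, String)> {
    records
        .into_iter()
        .filter(|(_, seq)| seq.len() >= min_len)
        .collect()
}
```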
fakit -c r test.fa
>s1
AGTATTTAAAAGAAGTAGAAGAAGAGTTAGAGTTTGGGATTTTGATAGAAGAGGCTAGAG
>s2
GGTTGAGTGCATGTGTGTGCCtagttgtagtagCAGGACAAACACGcnncatncantcag
>s3
ACTGAAAGTGTTAGAATGACAGAATGCTTCTCAGAGAG
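Note that `-c r` only reverses the sequence; the bases are not complemented, and the case of each base is preserved. A one-line sketch of that operation (`reverse_seq` is a hypothetical name, not fakit's function):

```rust
/// Reverse a sequence without complementing it.
fn reverse_seq(seq: &str) -> String {
    seq.chars().rev().collect()
}
```

Applied to the `s3` sequence from test.fa, this produces the same string as the output above.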
fakit -s test.fa
id  base_A  base_T  base_G  base_C  base_N  GC_Rate  seq_Len
s1  24      16      19      1       0       0.33     60
s2  14      14      17      15      0       0.53     60
s3  14      9       10      5       0       0.39     38
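In this table GC_Rate is (base_G + base_C) / seq_Len, for example (19 + 1) / 60 ≈ 0.33 for `s1`. The counting can be sketched as follows. The `BaseStats` type and `base_stats` function are hypothetical, and counting case-insensitively is my assumption, so the numbers may differ from fakit's for sequences that mix case or contain `n`:

```rust
/// Per-sequence base counts. (Hypothetical type for illustration.)
#[derive(Debug, Default, PartialEq)]
struct BaseStats {
    a: usize,
    t: usize,
    g: usize,
    c: usize,
    n: usize,
    len: usize,
}

impl BaseStats {
    /// GC rate = (G + C) / sequence length; 0.0 for an empty sequence.
    fn gc_rate(&self) -> f64 {
        if self.len == 0 {
            0.0
        } else {
            (self.g + self.c) as f64 / self.len as f64
        }
    }
}

/// Count A/T/G/C/N in a sequence.
/// Assumption: lowercase and uppercase bases are counted together.
fn base_stats(seq: &str) -> BaseStats {
    let mut stats = BaseStats::default();
    for b in seq.bytes() {
        match b.to_ascii_uppercase() {
            b'A' => stats.a += 1,
            b'T' => stats.t += 1,
            b'G' => stats.g += 1,
            b'C' => stats.c += 1,
            b'N' => stats.n += 1,
            _ => {} // any other symbol only contributes to the length
        }
        stats.len += 1;
    }
    stats
}
```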
fakit -f E test.fa
@s1
GAGATCGGAGAAGATAGTTTTAGGGTTTGAGATTGAGAAGAAGATGAAGAAAATTTATGA
+
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
@s2
gactnacntacnncGCACAAACAGGACgatgatgttgatCCGTGTGTGTACGTGAGTTGG
+
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
@s3
GAGAGACTCTTCGTAAGACAGTAAGATTGTGAAAGTCA
+
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
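The fake FASTQ record is the FASTA record with `@` in place of `>`, a `+` separator line, and a quality string that repeats the given character once per base. A minimal sketch, assuming ASCII sequences; `to_fastq` is a hypothetical helper rather than fakit's own code:

```rust
/// Build a four-line FASTQ record whose quality string repeats `qual`
/// once for every base in `seq` (byte length, so ASCII is assumed).
fn to_fastq(id: &str, seq: &str, qual: char) -> String {
    let quality = qual.to_string().repeat(seq.len());
    format!("@{}\n{}\n+\n{}\n", id, seq, quality)
}
```

As in the output above, the sequence itself is passed through unchanged, so mixed-case input stays mixed-case.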
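Author: 天使不设防
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise, this notice must be kept and a link to the original article must be shown in a clearly visible place on the page; otherwise the author reserves the right to pursue legal liability.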