Learning Python: pyfastx, a handy tool for FASTA/FASTQ processing
001. Iterating over FASTA sequences
(base) root@PC1:/home/test2# cat a.fasta    ## test FASTA file
>gene1 myc
AGCTGCCTAAGC
GGCATAGCTAATCG
>gene2 jun
ACCGAATCGGAGCGATG
GGCATTAAAGATCTAGCT
>gene3 malat1
AGGCTAGCGAG
GCGCGAG
GATTAGGCG
>>> import pyfastx    ## import the package
>>> fa = pyfastx.Fastx('a.fasta')    ## read the FASTA file
>>> type(fa)
<class 'Fastx'>
>>> for i,j,k in fa:    ## iterate: i is the name, j the sequence, k the comment
...     print(i)
...     print(j)
...     print(k)
...
gene1
AGCTGCCTAAGCGGCATAGCTAATCG
myc
gene2
ACCGAATCGGAGCGATGGGCATTAAAGATCTAGCT
jun
gene3
AGGCTAGCGAGGCGCGAGGATTAGGCG
malat1
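To make the three fields in the loop above concrete, here is a minimal pure-Python sketch of the same kind of iteration. It is not how pyfastx is implemented, and `iter_fasta` is a hypothetical helper name used only for illustration. It assumes the name is everything up to the first space in the header line and that the rest of the header is the comment.

```python
import io

def iter_fasta(handle):
    """Yield (name, sequence, comment) tuples from a FASTA text stream.

    Hypothetical helper for illustration only; not part of pyfastx.
    """
    name, comment, chunks = None, "", []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks), comment
            # Assumption: name ends at the first space, the rest is the comment.
            name, _, comment = line[1:].partition(" ")
            chunks = []
        elif name is not None:
            # Sequence lines of one record are joined into a single string.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks), comment

demo = io.StringIO(
    ">gene1 myc\nAGCTGCCTAAGC\nGGCATAGCTAATCG\n"
    ">gene2 jun\nACCGAATCGGAGCGATG\nGGCATTAAAGATCTAGCT\n"
)
records = list(iter_fasta(demo))
for name, seq, comment in records:
    print(name, seq, comment)
```

The point of the sketch is that a multi-line sequence comes back as one joined string, which matches the output printed in the session above.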
002. If the sequence contains lowercase letters, force uppercase output
(base) root@PC1:/home/test2# cat a.fasta    ## test FASTA file
>JZ822577.1 contig1 cDNA library of flower petals in tree peony by suppression subtractive hybridization Paeonia suffruticosa cDNA, mRNA sequence
CTctagcttaaaTTACTTCTTCACATTCCAGATCACTCAGGCTCTTTGTCATTTTAGTTTGACTAGGATATCGAGTATTCAAGCTCATCGCTTTTGGTAATCTTTGCGGTGCATGCCTTTGCATGCTGTATTGCTGCTTCATCATCCCCTTTGACTTGTGTGGCGGTGGCAAGACATCCGAAGAGTTAAGCGATGCTTGTCTAGTCAATTTCCCCATGTACAGAATCATTGTTGTCAATTGGTTGTTTCCTTGATGGTGAAGGGGCTTCAATACATGAGTTCCAAACTAACATTTCTTGACTAACACTTGAGGAAGAAGGACAAGGGTCCCCATGT
>>> for item in pyfastx.Fastx('a.fasta', uppercase=True):    ## read the data and return every sequence in uppercase
...     print(item)
...
('JZ822577.1', 'CTCTAGCTTAAATTACTTCTTCACATTCCAGATCACTCAGGCTCTTTGTCATTTTAGTTTGACTAGGATATCGAGTATTCAAGCTCATCGCTTTTGGTAATCTTTGCGGTGCATGCCTTTGCATGCTGTATTGCTGCTTCATCATCCCCTTTGACTTGTGTGGCGGTGGCAAGACATCCGAAGAGTTAAGCGATGCTTGTCTAGTCAATTTCCCCATGTACAGAATCATTGTTGTCAATTGGTTGTTTCCTTGATGGTGAAGGGGCTTCAATACATGAGTTCCAAACTAACATTTCTTGACTAACACTTGAGGAAGAAGGACAAGGGTCCCCATGT', 'contig1 cDNA library of flower petals in tree peony by suppression subtractive hybridization Paeonia suffruticosa cDNA, mRNA sequence')
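A small illustration of why uppercase output can matter downstream: a case-sensitive base count silently skips lowercase letters. The sketch below uses plain Python only. `count_gc` is a hypothetical helper, and the test string is just the first 19 bases of the record shown above.

```python
def count_gc(seq):
    """Count G and C bases, matching uppercase letters only.

    Hypothetical helper for illustration; it is deliberately case-sensitive.
    """
    return sum(base in ("G", "C") for base in seq)

# First 19 bases of the example record, in their original mixed case.
mixed = "CTctagcttaaaTTACTTC"

gc_mixed = count_gc(mixed)          # lowercase g/c are missed
gc_upper = count_gc(mixed.upper())  # all g/c are counted after normalization

print(gc_mixed, gc_upper)
```

Normalizing case while reading, as `uppercase=True` does in the session above, avoids having to remember the `.upper()` call in every later step.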
003. Iterating over FASTQ sequences
(base) root@PC1:/home/test2# cat b.fastq    ## test FASTQ file
@WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51
CTGNCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAANG
+WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51
BCC#4ADDHHBFHIJJIIJJJIIIIJHIJIJIIJGGIJJJJIGJJJJJJ##
>>> fq = pyfastx.Fastx('b.fastq')    ## read the data
>>> for i,j,k,l in fq:    ## iterate: i is the name, j the sequence, k the quality string, l the comment
...     print(i)
...     print(j)
...     print(k)
...     print(l)
...
WT_rep1_BAF155.1
CTGNCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAANG
BCC#4ADDHHBFHIJJIIJJJIIIIJHIJIJIIJGGIJJJJIGJJJJJJ##
SALLY:291:C149WACXX:2:1101:2579:1951 length=51
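The four fields returned for each read can be illustrated with a minimal pure-Python sketch. Again, this is not pyfastx's implementation, and `iter_fastq` is a hypothetical helper. It assumes the simple layout of exactly four lines per record (header, sequence, `+` line, quality), as in `b.fastq`, and does not handle multi-line FASTQ.

```python
import io

def iter_fastq(handle):
    """Yield (name, sequence, quality, comment) from a 4-line-per-record FASTQ stream.

    Hypothetical helper for illustration only; not part of pyfastx.
    """
    while True:
        header = handle.readline()
        if not header:          # end of input
            break
        header = header.rstrip("\n")
        if not header:          # skip blank lines between records
            continue
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Assumption: name ends at the first space, the rest is the comment.
        name, _, comment = header[1:].partition(" ")
        yield name, seq, qual, comment

demo = io.StringIO(
    "@WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51\n"
    "CTGNCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAANG\n"
    "+WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51\n"
    "BCC#4ADDHHBFHIJJIIJJJIIIIJHIJIJIIJGGIJJJJIGJJJJJJ##\n"
)
reads = list(iter_fastq(demo))
for name, seq, qual, comment in reads:
    print(name, len(seq), len(qual), comment)
```

The field order (name, sequence, quality, comment) mirrors the order in which the values are printed in the session above.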