python 中输出指定长度的fasta 序列
001、
(base) root@PC1:/home/test2# ls a.fasta test.py (base) root@PC1:/home/test2# cat a.fasta ## 测试fasta文件 >OR4F5_ENSG00000186092_ENST00000641515_61_1038_2618 CCCAGATCTCTTCAGTTTTTATGCCTCATTCTGTGAAAATTGCTGTAGTCTCTTCCAGTTATGAAGAAGGTAACTGCAGAGGCTATTTCCTGGAATGAATCAACGAGTGAAACGAATAAC TCTATGGTGACTGAATTCATTTTTCTGGGTCTCTCTGATTCTCAGGAACTCCAGACCTTCCTATTTATGTTGTTTTTTGTATTCTATGGAGGAATCGTGTTTGGAAACCTTCTTATTGTC ATAACAGTGGTATCTGACTCCCACCTTCACTCTCCCATGTACTTCCTGCTAGCCA TAAGTGAATTCAAGACATAACTCTTTTTTCAAAAAAAC >OR4F29_ENSG00000284733_ENST00000426406_20_955_995 AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTCCTAGTGTTTTCCTCTGTGCT CTATGTGGCAAGCATTACTGGAAACATCCTCATTGTGTTTTCTGTGACCACTGACCCTCATGGAGGCTGCATCGCTCAAATCTTCTTCATCCACGTCGTTGGTGGTGTGGAGATGGTGCT TGATGCAGTTCTCACTCCTTTTCTGAATCCAGTTGTCTATACATTCAGGAATAAGGAGATGAAGGCAGCAATAAAGAGAGTATGCAAACAGCTAGTGATTTACAAGAGGATCTCATAAAT GATATAATAAGCCCTTCTCATTAAACATGATATGG >OR4F16_ENSG00000284662_ENST00000332831_20_955_995 AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTCCTAGTGTTTTCCTCTGTGCT GCTCATAGCCATGGCCTTTGACAGATATGTGGCCCTATGTAAGCCCCTCCACTATCTGACGGACAGCTTCTACTGTGACCTTCCTCGGCTTCTCAGACTAGCCTGTACCGACACCTACAG ATTGCAGTTCATGGTCACTGTTAACAGTGGGTTTATCTGTGTGGGTACTTTCTTCATACTTCTAATCTCCTACGTCTTCATCCTGTTTACTGTTTGGAAACATTCCTCAGGTGGTTCATC GATATAATAAGCCCTTCTCATTAAACATGATATGG (base) root@PC1:/home/test2# cat test.py ## 测试脚本 #/usr/bin/python in_file = open("a.fasta", "r") out_file = open("result.txt", "w") dict1 = {} for i in in_file: i = i.strip() if i[0] == ">": key = i dict1[key] = "" else: dict1[key] += i line_length = 100 ## 此处指定序列长度 for i,j in dict1.items(): out_file.write(i + "\n") while len(j) > line_length: out_file.write(j[:line_length] + "\n") j = j[line_length:] if len(j) > 0: out_file.write(j + "\n") in_file.close() out_file.close() (base) root@PC1:/home/test2# python test.py ## 执行程序 (base) root@PC1:/home/test2# ls a.fasta result.txt test.py (base) root@PC1:/home/test2# cat result.txt ## 结果文件 >OR4F5_ENSG00000186092_ENST00000641515_61_1038_2618 CCCAGATCTCTTCAGTTTTTATGCCTCATTCTGTGAAAATTGCTGTAGTCTCTTCCAGTTATGAAGAAGGTAACTGCAGAGGCTATTTCCTGGAATGAAT CAACGAGTGAAACGAATAACTCTATGGTGACTGAATTCATTTTTCTGGGTCTCTCTGATTCTCAGGAACTCCAGACCTTCCTATTTATGTTGTTTTTTGT ATTCTATGGAGGAATCGTGTTTGGAAACCTTCTTATTGTCATAACAGTGGTATCTGACTCCCACCTTCACTCTCCCATGTACTTCCTGCTAGCCATAAGT GAATTCAAGACATAACTCTTTTTTCAAAAAAAC >OR4F29_ENSG00000284733_ENST00000426406_20_955_995 AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTC CTAGTGTTTTCCTCTGTGCTCTATGTGGCAAGCATTACTGGAAACATCCTCATTGTGTTTTCTGTGACCACTGACCCTCATGGAGGCTGCATCGCTCAAA TCTTCTTCATCCACGTCGTTGGTGGTGTGGAGATGGTGCTTGATGCAGTTCTCACTCCTTTTCTGAATCCAGTTGTCTATACATTCAGGAATAAGGAGAT GAAGGCAGCAATAAAGAGAGTATGCAAACAGCTAGTGATTTACAAGAGGATCTCATAAATGATATAATAAGCCCTTCTCATTAAACATGATATGG >OR4F16_ENSG00000284662_ENST00000332831_20_955_995 AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTC CTAGTGTTTTCCTCTGTGCTGCTCATAGCCATGGCCTTTGACAGATATGTGGCCCTATGTAAGCCCCTCCACTATCTGACGGACAGCTTCTACTGTGACC TTCCTCGGCTTCTCAGACTAGCCTGTACCGACACCTACAGATTGCAGTTCATGGTCACTGTTAACAGTGGGTTTATCTGTGTGGGTACTTTCTTCATACT TCTAATCTCCTACGTCTTCATCCTGTTTACTGTTTGGAAACATTCCTCAGGTGGTTCATCGATATAATAAGCCCTTCTCATTAAACATGATATGG