Writing FASTA sequences with a specified line length in Python

 

001.

(base) root@PC1:/home/test2# ls
a.fasta  test.py
(base) root@PC1:/home/test2# cat a.fasta                 ## test FASTA file
>OR4F5_ENSG00000186092_ENST00000641515_61_1038_2618
CCCAGATCTCTTCAGTTTTTATGCCTCATTCTGTGAAAATTGCTGTAGTCTCTTCCAGTTATGAAGAAGGTAACTGCAGAGGCTATTTCCTGGAATGAATCAACGAGTGAAACGAATAAC
TCTATGGTGACTGAATTCATTTTTCTGGGTCTCTCTGATTCTCAGGAACTCCAGACCTTCCTATTTATGTTGTTTTTTGTATTCTATGGAGGAATCGTGTTTGGAAACCTTCTTATTGTC
ATAACAGTGGTATCTGACTCCCACCTTCACTCTCCCATGTACTTCCTGCTAGCCA
TAAGTGAATTCAAGACATAACTCTTTTTTCAAAAAAAC
>OR4F29_ENSG00000284733_ENST00000426406_20_955_995
AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTCCTAGTGTTTTCCTCTGTGCT
CTATGTGGCAAGCATTACTGGAAACATCCTCATTGTGTTTTCTGTGACCACTGACCCTCATGGAGGCTGCATCGCTCAAATCTTCTTCATCCACGTCGTTGGTGGTGTGGAGATGGTGCT
TGATGCAGTTCTCACTCCTTTTCTGAATCCAGTTGTCTATACATTCAGGAATAAGGAGATGAAGGCAGCAATAAAGAGAGTATGCAAACAGCTAGTGATTTACAAGAGGATCTCATAAAT
GATATAATAAGCCCTTCTCATTAAACATGATATGG
>OR4F16_ENSG00000284662_ENST00000332831_20_955_995
AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTCCTAGTGTTTTCCTCTGTGCT
GCTCATAGCCATGGCCTTTGACAGATATGTGGCCCTATGTAAGCCCCTCCACTATCTGACGGACAGCTTCTACTGTGACCTTCCTCGGCTTCTCAGACTAGCCTGTACCGACACCTACAG
ATTGCAGTTCATGGTCACTGTTAACAGTGGGTTTATCTGTGTGGGTACTTTCTTCATACTTCTAATCTCCTACGTCTTCATCCTGTTTACTGTTTGGAAACATTCCTCAGGTGGTTCATC
GATATAATAAGCCCTTCTCATTAAACATGATATGG
(base) root@PC1:/home/test2# cat test.py              ## test script
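#!/usr/bin/python
# Rewrap every sequence in a.fasta to a fixed line length.
dict1 = {}                                ## header line -> full sequence

with open("a.fasta", "r") as in_file:
    key = None
    for i in in_file:
        i = i.strip()
        if not i:                         ## skip blank lines (i[0] would raise IndexError)
            continue
        if i.startswith(">"):
            key = i
            dict1[key] = ""
        elif key is not None:             ## ignore sequence text before the first header
            dict1[key] += i

line_length = 100                         ## set the output line length here
with open("result.txt", "w") as out_file:
    for i, j in dict1.items():
        out_file.write(i + "\n")
        ## write the sequence in chunks of line_length; the last chunk may be shorter
        for k in range(0, len(j), line_length):
            out_file.write(j[k:k + line_length] + "\n")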
(base) root@PC1:/home/test2# python test.py          ## run the script
(base) root@PC1:/home/test2# ls
a.fasta  result.txt  test.py
(base) root@PC1:/home/test2# cat result.txt          ## result file
>OR4F5_ENSG00000186092_ENST00000641515_61_1038_2618
CCCAGATCTCTTCAGTTTTTATGCCTCATTCTGTGAAAATTGCTGTAGTCTCTTCCAGTTATGAAGAAGGTAACTGCAGAGGCTATTTCCTGGAATGAAT
CAACGAGTGAAACGAATAACTCTATGGTGACTGAATTCATTTTTCTGGGTCTCTCTGATTCTCAGGAACTCCAGACCTTCCTATTTATGTTGTTTTTTGT
ATTCTATGGAGGAATCGTGTTTGGAAACCTTCTTATTGTCATAACAGTGGTATCTGACTCCCACCTTCACTCTCCCATGTACTTCCTGCTAGCCATAAGT
GAATTCAAGACATAACTCTTTTTTCAAAAAAAC
>OR4F29_ENSG00000284733_ENST00000426406_20_955_995
AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTC
CTAGTGTTTTCCTCTGTGCTCTATGTGGCAAGCATTACTGGAAACATCCTCATTGTGTTTTCTGTGACCACTGACCCTCATGGAGGCTGCATCGCTCAAA
TCTTCTTCATCCACGTCGTTGGTGGTGTGGAGATGGTGCTTGATGCAGTTCTCACTCCTTTTCTGAATCCAGTTGTCTATACATTCAGGAATAAGGAGAT
GAAGGCAGCAATAAAGAGAGTATGCAAACAGCTAGTGATTTACAAGAGGATCTCATAAATGATATAATAAGCCCTTCTCATTAAACATGATATGG
>OR4F16_ENSG00000284662_ENST00000332831_20_955_995
AGCCCAGTTGGCTGGACCAATGGATGGAGAGAATCACTCAGTGGTATCTGAGTTTTTGTTTCTGGGACTCACTCATTCATGGGAGATCCAGCTCCTCCTC
CTAGTGTTTTCCTCTGTGCTGCTCATAGCCATGGCCTTTGACAGATATGTGGCCCTATGTAAGCCCCTCCACTATCTGACGGACAGCTTCTACTGTGACC
TTCCTCGGCTTCTCAGACTAGCCTGTACCGACACCTACAGATTGCAGTTCATGGTCACTGTTAACAGTGGGTTTATCTGTGTGGGTACTTTCTTCATACT
TCTAATCTCCTACGTCTTCATCCTGTTTACTGTTTGGAAACATTCCTCAGGTGGTTCATCGATATAATAAGCCCTTCTCATTAAACATGATATGG
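
The script above reads every record into a dictionary before writing anything. For large FASTA files, a possible variant is to rewrap one record at a time so that only the current sequence is held in memory. The following is a minimal sketch under that assumption; the function names `split_sequence` and `wrap_fasta` are hypothetical and are not part of the original script.

```python
def split_sequence(seq, width):
    """Cut one sequence string into pieces of at most `width` characters."""
    return [seq[k:k + width] for k in range(0, len(seq), width)]


def wrap_fasta(lines, width=100):
    """Yield header and rewrapped sequence lines, one record at a time.

    `lines` can be any iterable of strings, for example an open file.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    header, parts = None, []
    for raw in lines:
        line = raw.strip()
        if not line:                      # skip blank lines
            continue
        if line.startswith(">"):
            if header is not None:        # emit the record that just ended
                yield header
                yield from split_sequence("".join(parts), width)
            header, parts = line, []
        elif header is not None:          # ignore text before the first header
            parts.append(line)
    if header is not None:                # emit the last record
        yield header
        yield from split_sequence("".join(parts), width)


if __name__ == "__main__":
    # Small in-memory demo; the sequences are made up for illustration.
    demo = [">seq1", "ACGTAC", "GT", "", ">seq2", "TTT"]
    for out_line in wrap_fasta(demo, width=5):
        print(out_line)
```

To produce a file like result.txt, the same generator could be fed an open file handle, for example `out_file.writelines(l + "\n" for l in wrap_fasta(in_file, 100))`. Unlike the dictionary approach, this keeps records with identical header lines separate instead of letting the later one overwrite the earlier one.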

 

posted @ 2022-08-12 13:30  小鲨鱼2018