【Python小试】根据外显子位置生成CDS序列

已知

genomic_dna.txt

TCGATCGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGACTGATCGATCGATCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGATCGATCATATGTCAGTCGATGCATCGTAGCATCGTATAGTAGCTACGTAGCTACGATCGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTAGCTAGTACGATCGCGTAGCTAGCATGCTACGTAGATCGATCGATGCATGCTAGCTAGCTAGCTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCGATGCTACGTAGATCGATCGCTAGTAGATCGATCGCTAGCTAGCTGACTAGTACGCTGCTAGTAGTCAGCTAGATCGATGCTAGTCA

exons.txt

5,58
72,133
190,276
340,398

编码

genomic_dna = open("genomic_dna.txt").read()
exon_locations = open("exons.txt")
output = open("coding_sequence.txt", "w")

coding_sequence = ""
for line in exon_locations:
    positions = line.split(',')
    start = int(positions[0])
    stop = int(positions[1])
    exon = genomic_dna[start:stop]
    coding_sequence = coding_sequence + exon

output.write(coding_sequence)
output.close()

结果

coding_sequence.txt

CGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGACGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCG
posted @ 2020-04-20 23:21  生物信息与育种  阅读(612)  评论(0编辑  收藏  举报