Reverse-complement the chromosome sequences in the FASTA file genome_test.fa and write the result to the file genome_test_RC.fa
genome_test.fa
>chr1
ATATATATAT
>chr2
ATATATATATCGCGCGCGCG
>chr3
ATATATATATCGCGCGCGCGATATATATAT
>chr4
ATATATATATCGCGCGCGCGATATATATATCGCGCGCGCG
>chr5
ATATATATATCGCGCGCGCGATATATATATCGCGCGCGCGATATATATAT
Create the file Reverse_Complement.py and enter the following Python script
Python script
import sys  # command-line arguments

f_fasta = sys.argv[1]  # FASTA file name taken from the command line

# Assumes each sequence occupies a single line, as in genome_test.fa
with open(f_fasta) as f, open("genome_test_RC.fa", "w") as f_RC:
    for line in f:  # read line by line
        line = line.strip()  # drop the trailing newline
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            chr_id = line + '_RC'
        else:
            # Reverse, then complement; the lowercase targets keep a base from being replaced twice
            chr_seq = line[::-1].replace('A','t').replace('T','a').replace('C','g').replace('G','c').upper()
            # Print the result and write it to the output file
            print(chr_id)
            print(chr_seq)
            f_RC.write(chr_id + '\n')
            f_RC.write(chr_seq + '\n')
From a cmd terminal, pass the file name as a command-line argument to run the Python script above on genome_test.fa
E:\15_python\DEBUG>python Reverse_Complement.py genome_test.fa
Result
genome_test_RC.fa
>chr1_RC
ATATATATAT
>chr2_RC
CGCGCGCGCGATATATATAT
>chr3_RC
ATATATATATCGCGCGCGCGATATATATAT
>chr4_RC
CGCGCGCGCGATATATATATCGCGCGCGCGATATATATAT
>chr5_RC
ATATATATATCGCGCGCGCGATATATATATCGCGCGCGCGATATATATAT
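
The script above relies on every sequence sitting on a single line. Many FASTA files wrap sequences over several lines, so the following is a minimal sketch of one way to handle that case. It is not part of the original script: the names COMPLEMENT, reverse_complement, read_fasta and reverse_complement_fasta are hypothetical, and keeping N and lowercase bases (rather than uppercasing everything) is an assumption. It uses str.maketrans and str.translate in place of the chained replace calls.

```python
import io

# Complement table; including N and lowercase bases is an assumption of this sketch
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(handle):
    """Yield (header, sequence) pairs, joining sequences wrapped over several lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement_fasta(src, dst):
    """Write the reverse complement of every record in src to dst."""
    for header, seq in read_fasta(src):
        dst.write(f"{header}_RC\n{reverse_complement(seq)}\n")


# Demo on an in-memory record: chr2 from genome_test.fa, wrapped over two lines
demo = io.StringIO(">chr2\nATATATATAT\nCGCGCGCGCG\n")
out = io.StringIO()
reverse_complement_fasta(demo, out)
print(out.getvalue(), end="")
```

With real files, the two StringIO objects would be replaced by open(sys.argv[1]) and open("genome_test_RC.fa", "w"). Applying reverse_complement twice returns the original sequence, which gives a quick sanity check on the output.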