[BioCode] Splitting multiple protein sequences into individual txt files

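Code description:

The protein sequences are in FASTA format, and a single txt file contains many of them. Computing ss (secondary structure), PSSM, or disorder scores has to be done one sequence at a time, so the file needs to be split into one file per sequence.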

Before splitting:

After splitting:

Here is the code:

package single;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Split one multi-sequence file into individual TXT files
public class Single {
    public static void getTxt(String path) throws IOException {
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            String header;
            int count = 0;
            // Assumes every record is exactly two lines: a header line, then the sequence on one line
            while ((header = br.readLine()) != null) {
                System.out.println(header);
                String sequence = br.readLine();
                count++;
                //E:\experiment----N-formylated\single
                // Number each output txt file sequentially: 1.txt, 2.txt, ...
                try (BufferedWriter writer = new BufferedWriter(new FileWriter(
                        "E:\\experiment--help\\linglingbao\\new-single\\" + count + ".txt"))) {
                    writer.write(header + "\n");
                    if (sequence != null) { // a trailing header with no sequence line would otherwise throw
                        writer.write(sequence);
                    }
                } // each writer is closed here, so file handles are not leaked
            }
            System.out.println(count); // total number of files written
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
       
        String path = "E:\\experiment--help\\linglingbao\\new-single\\seq.txt";
        try {
            getTxt(path);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
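
The code above reads exactly two lines per record, so it only works when each sequence sits on a single line. Many FASTA files wrap a sequence across several lines. As a rough sketch of how that case could be handled (not part of the original post), the version below starts a new record at every line beginning with ">". The class name FastaSplitter, its method names, and the command-line arguments are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits FASTA input whose sequences may span several lines
public class FastaSplitter {

    // Groups lines into records; a record starts at each line beginning with '>'
    public static List<String> splitRecords(BufferedReader br) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        String line;
        while ((line = br.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                if (current != null) {
                    records.add(current.toString()); // finish the previous record
                }
                current = new StringBuilder(line).append('\n');
            } else if (current != null) {
                current.append(line.trim()).append('\n');
            }
            // sequence lines that appear before the first header are ignored
        }
        if (current != null) {
            records.add(current.toString()); // finish the last record
        }
        return records;
    }

    // Writes each record to outDir/1.txt, outDir/2.txt, ... and returns the count
    public static int writeRecords(List<String> records, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        int count = 0;
        for (String record : records) {
            count++;
            Files.write(outDir.resolve(count + ".txt"),
                    record.getBytes(StandardCharsets.UTF_8));
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java FastaSplitter <input file> <output directory>");
            return;
        }
        try (BufferedReader br =
                Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            System.out.println(writeRecords(splitRecords(br), Paths.get(args[1])));
        }
    }
}
```

Taking the input file and output directory as arguments avoids hard-coding a drive path, and creating the output directory first avoids a failure when it does not exist yet. Each output file keeps the sequence wrapped across lines as in the input; if a downstream tool expects the sequence on one line, the lines would have to be joined before writing.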

 

posted @ 2017-07-04 19:49  于淼