孟灵己  
  • H3K27ac
mkdir named_H3K27ac
mkdir named_H3K27ac_s1
mkdir named_H3K27ac_s2
mkdir named_H3K27ac_s3
mkdir named_H3K27ac_s4
(base) [xxzhang@cu08 human_histone_mark]$ ls -Q ./named_H3K27ac  |head -500 |xargs -i mv ./named_H3K27ac/{} ./named_H3K27ac_s1/
ls: write error: Broken pipe
(base) [xxzhang@cu08 human_histone_mark]$ ls -Q ./named_H3K27ac  |head -500 |xargs -i mv ./named_H3K27ac/{} ./named_H3K27ac_s2/
ls: write error: Broken pipe
(base) [xxzhang@cu08 human_histone_mark]$ ls -Q ./named_H3K27ac  |head -500 |xargs -i mv ./named_H3K27ac/{} ./named_H3K27ac_s3/
ls: write error: Broken pipe
(base) [xxzhang@cu08 human_histone_mark]$ ls -Q ./named_H3K27ac  |head -500 |xargs -i mv ./named_H3K27ac/{} ./named_H3K27ac_s4/
ls: write error: Broken pipe
(base) [xxzhang@cu08 human_histone_mark]$ mv named_H3K27ac named_H3K27ac_s5
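
The repeated `ls | head -500 | xargs mv` steps above could probably be folded into one loop. The `Broken pipe` message appears to be harmless here: `head` stops reading after 500 names while `ls` is still writing, and the files that were already passed on still get moved. Below is a minimal sketch, not what I actually ran. `split_into_batches` is a hypothetical helper name, and the demo uses 12 empty throwaway files with 5 per batch instead of the real data with 500 per batch.

```shell
# Hypothetical helper: move the files in SRC into PREFIX_s1, PREFIX_s2, ...
# with at most SIZE files per batch directory.
split_into_batches() {
    src=$1; prefix=$2; size=$3
    i=0
    # The glob is expanded once, in sorted order, before any file is moved.
    for f in "$src"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and an unmatched glob
        batch="${prefix}_s$(( i / size + 1 ))"
        mkdir -p "$batch"
        mv -- "$f" "$batch/"
        i=$(( i + 1 ))
    done
}

# Demo on throwaway data (file names are made up).
demo=$(mktemp -d)
mkdir "$demo/named_demo"
n=1
while [ "$n" -le 12 ]; do
    : > "$demo/named_demo/sample_$n.bed.gz"
    n=$(( n + 1 ))
done
split_into_batches "$demo/named_demo" "$demo/named_demo" 5
ls "$demo"
```

Unlike the transcript, which renames the leftover directory to become the last batch, this sketch leaves the source directory empty and puts the remainder into its own numbered directory.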

This may seem a bit tedious, but I still think it is worth trying.
Create a new folder, H3K27ac, and move all of these into it.
Now build the indexes.

(base) [xxzhang@cu08 H3K27ac]$ giggle index -i "./named_H3K27ac_s1/*" -o ./named_H3K27ac_s1_index -s -f
Indexed 22372359 intervals.
(base) [xxzhang@cu08 H3K27ac]$ giggle index -i "./named_H3K27ac_s2/*" -o ./named_H3K27ac_s2_index -s -f
Indexed 20255859 intervals.
(base) [xxzhang@cu08 H3K27ac]$ giggle index -i "./named_H3K27ac_s3/*" -o ./named_H3K27ac_s3_index -s -f
Indexed 16521059 intervals.
(base) [xxzhang@cu08 H3K27ac]$ giggle index -i "./named_H3K27ac_s4/*" -o ./named_H3K27ac_s4_index -s -f
Indexed 21998978 intervals.
(base) [xxzhang@cu08 H3K27ac]$ giggle index -i "./named_H3K27ac_s5/*" -o ./named_H3K27ac_s5_index -s -f
Indexed 20384966 intervals.

Once the indexes are built, search each one separately, then merge the result files.

(base) [xxzhang@cu08 H3K27ac]$ giggle search -i ./named_H3K27ac_s1_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27ac_s1.result
(base) [xxzhang@cu08 H3K27ac]$ giggle search -i ./named_H3K27ac_s2_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27ac_s2.result
(base) [xxzhang@cu08 H3K27ac]$ giggle search -i ./named_H3K27ac_s3_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27ac_s3.result
(base) [xxzhang@cu08 H3K27ac]$ giggle search -i ./named_H3K27ac_s4_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27ac_s4.result
(base) [xxzhang@cu08 H3K27ac]$ giggle search -i ./named_H3K27ac_s5_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27ac_s5.result
(base) [xxzhang@cu08 H3K27ac]$ cat Hs_repeat.bed.gz.giggle.H3K27ac_s* >Hs_repeat.bed.gz.giggle.H3K27ac_all.result # TODO: come back and remove the redundant "#file" header lines that come from the other result files
(base) [xxzhang@cu08 H3K27ac]$ awk '$8>0' Hs_repeat.bed.gz.giggle.H3K27ac_all.result >repeat_positive.H3K27ac.result
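
The merge step leaves one `#file` header line per batch in the combined file, and with a plain `$8>0` a non-numeric header field is compared as a string, so those header lines can slip through the filter as well. Here is a rough sketch of one way to handle both. `merge_results` and `filter_positive` are hypothetical helper names, the column layout (header starting with `#`, score in column 8) is my assumption about the `giggle search -s` output, and the two data rows are made up for the demo.

```shell
# Hypothetical helper: concatenate result tables, keeping only the
# "#" header line of the first file.
merge_results() {
    awk 'FNR == 1 && NR != 1 && /^#/ { next } { print }' "$@"
}

# Hypothetical helper: keep header lines plus rows whose 8th column is > 0.
# "$8 + 0" forces a numeric comparison.
filter_positive() {
    awk '/^#/ { print; next } ($8 + 0) > 0' "$1"
}

# Demo on made-up rows (not real giggle output).
demo=$(mktemp -d)
printf '#file\tfile_size\toverlaps\todds_ratio\tfishers_two_tail\tfishers_left_tail\tfishers_right_tail\tcombo_score\n' > "$demo/header"
{ cat "$demo/header"; printf 'a.bed.gz\t10\t3\t2.0\t0.01\t0.99\t0.01\t4.5\n'; } > "$demo/demo_s1.result"
{ cat "$demo/header"; printf 'b.bed.gz\t10\t0\t0.5\t0.5\t0.2\t0.9\t-1.2\n'; } > "$demo/demo_s2.result"

merge_results "$demo"/demo_s*.result > "$demo/demo_all.result"
filter_positive "$demo/demo_all.result" > "$demo/demo_positive.result"
cat "$demo/demo_positive.result"
```

On the real data the calls would presumably look like `merge_results Hs_repeat.bed.gz.giggle.H3K27ac_s*.result`, since that glob does not match the `_all` output file.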

At this point H3K27ac is done.
Next, on to the remaining three.

  • H3K4me3 (the promoter mark)
(base) [xxzhang@cu08 human_histone_mark]$ mkdir H3K4me3
(base) [xxzhang@cu08 human_histone_mark]$ cp ./named_sort/H3K4me3* ./H3K4me3/
(base) [xxzhang@cu08 human_histone_mark]$ cd ./H3K4me3/
(base) [xxzhang@cu08 H3K4me3]$ mkdir named_H3K4me3_s1
(base) [xxzhang@cu08 H3K4me3]$ mkdir named_H3K4me3_s2
(base) [xxzhang@cu08 H3K4me3]$ mkdir named_H3K4me3_s3
(base) [xxzhang@cu08 H3K4me3]$ mkdir named_H3K4me3_s4
(base) [xxzhang@cu08 H3K4me3]$ ls -Q ./  |head -500 |xargs -i mv ./{} ./named_H3K4me3_s1/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K4me3]$ ls -Q ./  |head -500 |xargs -i mv ./{} ./named_H3K4me3_s2/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K4me3]$ ls -Q ./  |head -500 |xargs -i mv ./{} ./named_H3K4me3_s3/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K4me3]$ mv ./*.gz ./named_H3K4me3_s4/
(base) [xxzhang@cu08 H3K4me3]$ giggle index -i "./named_H3K4me3_s1/*" -o ./named_H3K4me3_s1_index -s -f
Indexed 15307478 intervals.
(base) [xxzhang@cu08 H3K4me3]$ giggle index -i "./named_H3K4me3_s2/*" -o ./named_H3K4me3_s2_index -s -f
Indexed 14881229 intervals.
(base) [xxzhang@cu08 H3K4me3]$ giggle index -i "./named_H3K4me3_s3/*" -o ./named_H3K4me3_s3_index -s -f
Indexed 14168564 intervals.
(base) [xxzhang@cu08 H3K4me3]$ giggle index -i "./named_H3K4me3_s4/*" -o ./named_H3K4me3_s4_index -s -f
Indexed 27907381 intervals.
(base) [xxzhang@cu08 H3K4me3]$ cp ../Hs_repeat.bed.gz ./
(base) [xxzhang@cu08 H3K4me3]$ giggle search -i ./named_H3K4me3_s1_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K4me3_s1.result
(base) [xxzhang@cu08 H3K4me3]$ giggle search -i ./named_H3K4me3_s2_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K4me3_s2.result
(base) [xxzhang@cu08 H3K4me3]$ giggle search -i ./named_H3K4me3_s3_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K4me3_s3.result
(base) [xxzhang@cu08 H3K4me3]$ giggle search -i ./named_H3K4me3_s4_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K4me3_s4.result
(base) [xxzhang@cu08 H3K4me3]$ cat Hs_repeat.bed.gz.giggle.H3K4me3_s* >Hs_repeat.bed.gz.giggle.H3K4me3_all.result
(base) [xxzhang@cu08 H3K4me3]$ awk '$8>0' Hs_repeat.bed.gz.giggle.H3K4me3_all.result >repeat_positive.H3K4me3.result
  • H3K4me1 (enhancers)
(base) [xxzhang@cu08 human_histone_mark]$ mkdir H3K4me1
(base) [xxzhang@cu08 human_histone_mark]$ cp ./named_sort/H3K4me1* ./H3K4me1/
(base) [xxzhang@cu08 human_histone_mark]$ cd ./H3K4me1/
(base) [xxzhang@cu08 H3K4me1]$ mkdir named_H3K4me1_s1
(base) [xxzhang@cu08 H3K4me1]$ mkdir named_H3K4me1_s2
(base) [xxzhang@cu08 H3K4me1]$ ls -Q ./  |head -500 |xargs -i mv ./{} ./named_H3K4me1_s1/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K4me1]$ mv ./*.gz ./named_H3K4me1_s2/
(base) [xxzhang@cu08 H3K4me1]$ giggle index -i "./named_H3K4me1_s1/*" -o ./named_H3K4me1_s1_index -s -f
Indexed 35561267 intervals.
(base) [xxzhang@cu08 H3K4me1]$ giggle index -i "./named_H3K4me1_s2/*" -o ./named_H3K4me1_s2_index -s -f
Indexed 42409459 intervals.
(base) [xxzhang@cu08 H3K4me1]$ cp ../Hs_repeat.bed.gz ./
(base) [xxzhang@cu08 H3K4me1]$ giggle search -i ./named_H3K4me1_s1_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K4me1_s1.result
(base) [xxzhang@cu08 H3K4me1]$ giggle search -i ./named_H3K4me1_s2_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K4me1_s2.result
(base) [xxzhang@cu08 H3K4me1]$ cat Hs_repeat.bed.gz.giggle.H3K4me1_s* >Hs_repeat.bed.gz.giggle.H3K4me1_all.result
(base) [xxzhang@cu08 H3K4me1]$ awk '$8>0' Hs_repeat.bed.gz.giggle.H3K4me1_all.result >repeat_positive.H3K4me1.result
  • H3K27me3
(base) [xxzhang@cu08 human_histone_mark]$ mkdir H3K27me3
(base) [xxzhang@cu08 human_histone_mark]$ cp ./named_sort/H3K27me3* ./H3K27me3/
(base) [xxzhang@cu08 human_histone_mark]$ cd ./H3K27me3/
(base) [xxzhang@cu08 H3K27me3]$ mkdir named_H3K27me3_s1
(base) [xxzhang@cu08 H3K27me3]$ mkdir named_H3K27me3_s2
(base) [xxzhang@cu08 H3K27me3]$ mkdir named_H3K27me3_s3
(base) [xxzhang@cu08 H3K27me3]$ ls -Q ./  |head -500 |xargs -i mv ./{} ./named_H3K27me3_s1/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K27me3]$ ls -Q ./  |head -500 |xargs -i mv ./{} ./named_H3K27me3_s2/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K27me3]$ mv ./*.gz ./named_H3K27me3_s3/
(base) [xxzhang@cu08 H3K27me3]$ giggle index -i "./named_H3K27me3_s1/*" -o ./named_H3K27me3_s1_index -s -f
Indexed 5884451 intervals.
(base) [xxzhang@cu08 H3K27me3]$ giggle index -i "./named_H3K27me3_s2/*" -o ./named_H3K27me3_s2_index -s -f
Indexed 4270175 intervals.
(base) [xxzhang@cu08 H3K27me3]$ giggle index -i "./named_H3K27me3_s3/*" -o ./named_H3K27me3_s3_index -s -f
Indexed 4924467 intervals.
(base) [xxzhang@cu08 H3K27me3]$ cp ../Hs_repeat.bed.gz ./
(base) [xxzhang@cu08 H3K27me3]$ giggle search -i ./named_H3K27me3_s1_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27me3_s1.result
(base) [xxzhang@cu08 H3K27me3]$ giggle search -i ./named_H3K27me3_s2_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27me3_s2.result
(base) [xxzhang@cu08 H3K27me3]$ giggle search -i ./named_H3K27me3_s3_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27me3_s3.result
(base) [xxzhang@cu08 H3K27me3]$ cat Hs_repeat.bed.gz.giggle.H3K27me3_s* >Hs_repeat.bed.gz.giggle.H3K27me3_all.result
(base) [xxzhang@cu08 H3K27me3]$ awk '$8>0' Hs_repeat.bed.gz.giggle.H3K27me3_all.result >repeat_positive.H3K27me3.result
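
The index and search steps are the same for every mark and every batch, so they could be generated instead of typed out. The sketch below only prints the commands and does not run giggle; `print_giggle_commands` is a hypothetical helper name, and the flags are simply the ones used in the transcripts above.

```shell
# Hypothetical helper: print the giggle index/search commands for each batch
# of one mark. Nothing is executed here.
print_giggle_commands() {
    mark=$1; query=$2; shift 2
    for batch in "$@"; do            # batch labels such as s1 s2 s3
        printf '%s\n' "giggle index -i \"./named_${mark}_${batch}/*\" -o ./named_${mark}_${batch}_index -s -f"
        printf '%s\n' "giggle search -i ./named_${mark}_${batch}_index/ -q ${query} -s >${query}.giggle.${mark}_${batch}.result"
    done
}

# Preview the commands for the three H3K27me3 batches.
print_giggle_commands H3K27me3 Hs_repeat.bed.gz s1 s2 s3
```

After checking the printed commands, the output can be piped to `sh` from inside the mark's directory to actually run them.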

Also, now that I have a working approach, I want to take a look at the ATAC-seq data as well.

(base) [xxzhang@cu08 human_chromAcc]$ mv named named_old
(base) [xxzhang@cu08 human_chromAcc]$ mkdir named
(base) [xxzhang@cu08 human_chromAcc]$ python rename.py -m ca_human_data_information_v2.txt -i ./human_ca -o ./named  -n cistrome_id_to_name_map_v2.txt
(base) [xxzhang@cu08 human_chromAcc]$ chmod 777 sort_bed
(base) [xxzhang@cu08 human_chromAcc]$ ./sort_bed "./named/[A-J]*" ./named_sort/ 30
(base) [xxzhang@cu08 named_sort]$ mv *.gz ./named_ca_s5/
(base) [xxzhang@cu08 named_sort]$ giggle index -i "./named_ca_s1/*" -o ./named_ca_s1_index -s -f
Indexed 15441476 intervals.
(base) [xxzhang@cu08 named_sort]$ giggle index -i "./named_ca_s2/*" -o ./named_ca_s2_index -s -f
Indexed 13163706 intervals.
(base) [xxzhang@cu08 named_sort]$ giggle index -i "./named_ca_s3/*" -o ./named_ca_s3_index -s -f
Indexed 32631025 intervals.
(base) [xxzhang@cu08 named_sort]$ giggle index -i "./named_ca_s4/*" -o ./named_ca_s4_index -s -f
Indexed 55888342 intervals.
(base) [xxzhang@cu08 named_sort]$ giggle index -i "./named_ca_s5/*" -o ./named_ca_s5_index -s -f
Indexed 39325800 intervals.
(base) [xxzhang@cu08 named_sort]$ giggle search -i ./named_ca_s1_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.ca_s1.result
(base) [xxzhang@cu08 named_sort]$ giggle search -i ./named_ca_s2_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.ca_s2.result
(base) [xxzhang@cu08 named_sort]$ giggle search -i ./named_ca_s3_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.ca_s3.result
(base) [xxzhang@cu08 named_sort]$ giggle search -i ./named_ca_s4_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.ca_s4.result
(base) [xxzhang@cu08 named_sort]$ giggle search -i ./named_ca_s5_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.ca_s5.result
(base) [xxzhang@cu08 named_sort]$ cat Hs_repeat.bed.gz.giggle.ca_s* >Hs_repeat.bed.gz.giggle.ca_all.result


With all of the above processed, I am now thinking about what kind of plot would present the data well.

(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.ca.result >repeat_positive.ca.sort.result
(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.H3K4me1.result >repeat_positive.H3K4me1.sort.result
(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.H3K4me3.result >repeat_positive.H3K4me3.sort.result
(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.H3K36me3.result >repeat_positive.H3K36me3.sort.result
(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.H3K9me3.result >repeat_positive.H3K9me3.sort.result
(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.H3K27me3.result >repeat_positive.H3K27me3.sort.result
(base) [xxzhang@cu08 positive]$ sort -n -k 8nr repeat_positive.H3K27ac.result >repeat_positive.H3K27ac.sort.result
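
The seven sort commands differ only in the file name, so a loop would do the same job. This is a sketch under a few assumptions: `sort_by_score` is a hypothetical helper name, header lines start with `#`, the score sits in column 8, and the demo rows are made up. It uses `-g` (a GNU sort extension that also understands scientific notation) instead of `-n`; if the scores are always plain decimals, `-n` as in the transcript should behave the same.

```shell
# Hypothetical helper: sort one result table by column 8, largest first,
# keeping any "#" header lines at the top.
sort_by_score() {
    in=$1; out=$2
    {
        grep '^#' "$in"
        grep -v '^#' "$in" | LC_ALL=C sort -k8,8gr
    } > "$out"
}

# Demo on made-up rows (not real giggle output).
demo=$(mktemp -d)
{
    printf '#file\tc2\tc3\tc4\tc5\tc6\tc7\tcombo_score\n'
    printf 'x.bed.gz\t1\t1\t1\t1\t1\t1\t2.5\n'
    printf 'y.bed.gz\t1\t1\t1\t1\t1\t1\t10\n'
    printf 'z.bed.gz\t1\t1\t1\t1\t1\t1\t0.3\n'
} > "$demo/repeat_positive.demo.result"

for f in "$demo"/repeat_positive.*.result; do
    case "$f" in *.sort.result) continue ;; esac   # skip outputs from earlier runs
    [ -f "$f" ] || continue
    sort_by_score "$f" "${f%.result}.sort.result"
done
cat "$demo/repeat_positive.demo.sort.result"
```

Keeping the header at the top means the sorted file can be read straight into a plotting tool with column names intact.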

posted on 2022-08-22 00:07  孟灵己