[TIPS]BEDtools -f -split options produce empty results

Questions: 

could you give an example of intersectBed -split syntax working with the current version of bedtools (v2.16.2)?

My input A is a BED12 containing only reads that have been aligned to *exon-exon junctions*.
My input B is a gtf annotation file.

Since input A contains only reads that have been mapped to splice junctions, I need to be strict about the degree of overlap, therefore I use -f 1.
With a previous version of bedtools (2.11), I was using the following:
$ intersectBed -a myAlignment.bed12 -b myAnnotation.gtf -wa -f 1  -wb  -bed -split > myOutput.hit

The same syntax unfortunately now returns an empty output file.

Only when being less strict with the -f option I'm getting some hits (same as above, but with -f 0.55 or just skipping the -f restriction).

But since my alignment file contains only reads falling into splice junctions, I would like to be as strict as possible.
 
Answers: 
 
The basic issue is that the behavior of the -f argument has changed in the latest version when dealing with -split records.  Let's consider a toy example where we have split record with 3 "blocks" that are each 10 base pairs.  Previously, if any of the 3 blocks overlapped something in the B file that exceeded the -f parameter, then the record was reported.  Currently, the _total_ number of base pairs intersecting _each_ block must be greater than -f percent of the _total_ length of the 3 blocks.  In other words, whereas a single feature that overlapped 100% of one block would now only represent a 33% overlap with the blocked feature.
 
I recognize that this is the right way to look at things in some cases and not in others.  I will give this some thought and am welcome to input from the list.  Blocked/split records are difficult to handle with the existing infrastructure...
 
In tracking this down, I did uncover a bug that reduced the number of B intervals that were considered for potential -split overlaps.  This has been addressed and corrected in the github repository.
posted @ 2012-09-15 14:51  Puriney  阅读(315)  评论(0编辑  收藏  举报