Installing and Using featureCounts

Official website: http://subread.sourceforge.net/

Subread package: high-performance read alignment, quantification and mutation discovery

The Subread package comprises a suite of software programs for processing next-gen sequencing read data including:

Subread: a general-purpose read aligner which can align both genomic DNA-seq and RNA-seq reads. It can also be used to discover genomic mutations including short indels and structural variants.
Subjunc: a read aligner developed for aligning RNA-seq reads and for the detection of exon-exon junctions. Gene fusion events can be detected as well.
featureCounts: a software program developed for counting reads to genomic features such as genes, exons, promoters and genomic bins.
Sublong: a long-read aligner that is designed based on seed-and-vote.
exactSNP: a SNP caller that discovers SNPs by testing signals against local background noises.

These programs were also implemented in Bioconductor R package Rsubread.

Download and installation:

https://sourceforge.net/projects/subread/files/

The package is ready to use as soon as it is extracted; the executables are in the bin directory.

wget https://jaist.dl.sourceforge.net/project/subread/subread-2.0.2/subread-2.0.2-Linux-x86_64.tar.gz

tar -zxvf subread-2.0.2-Linux-x86_64.tar.gz

cd subread-2.0.2-Linux-x86_64

cd bin
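
As an optional check after extraction, the sketch below puts the bin directory on PATH and reports whether the shell can find featureCounts. The install location stored in SUBREAD_HOME and the helper name check_featurecounts are assumptions made for illustration; adjust the path to wherever the archive was actually extracted.

```shell
#!/usr/bin/env bash
# Assumed install location (hypothetical); change it to the real extraction directory.
SUBREAD_HOME="${SUBREAD_HOME:-$HOME/subread-2.0.2-Linux-x86_64}"

# Make the bundled executables available in the current shell session.
export PATH="$SUBREAD_HOME/bin:$PATH"

# Report whether featureCounts can be found, without aborting the script.
check_featurecounts() {
    if command -v featureCounts >/dev/null 2>&1; then
        echo "featureCounts found at: $(command -v featureCounts)"
        featureCounts -v   # prints the program version
    else
        echo "featureCounts not found; check SUBREAD_HOME=$SUBREAD_HOME"
    fi
}

check_featurecounts
```

To make the PATH change permanent, the export line can be appended to a shell startup file such as ~/.bashrc.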

Usage:

Basic syntax

featureCounts [options] -a <annotation_file> -o <output_file> input_file1 [input_file2] ...

Parameter descriptions

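Parameter	Description
input file	Input BAM/SAM file(s); multiple files may be given
-a <string>	Name of the annotation file (GTF/GFF or SAF); gzipped files are supported
-F <string>	Format of the annotation file, GTF or SAF; the command-line (C) version defaults to GTF
-A <string>	A comma-delimited file with two columns: the first holds chromosome names as used in the annotation, the second the corresponding names as used in the reads. It is used to match chromosome names between the annotation and the reads. The file must not contain a header line
-J	Count the reads that support each exon-exon junction; the junction counts are written to a separate <output_file>.jcounts file
-G <string>	When -J is set, -G supplies the reference genome FASTA file that was used for alignment, which helps improve junction counting
-M	If set, multi-mapping reads are also counted
-o <string>	Name of the output file, which contains the read counts
-O	Allow multi-overlapping reads: a read that overlaps more than one feature (or meta-feature) is counted once for each of them
-T <int>	Number of threads, 1 to 32 (default 1)
Parameters for paired-end reads and feature/meta-feature selection	Description
-p	Paired-end data only. In versions before 2.0.2 this option alone makes featureCounts count fragments instead of reads; from 2.0.2 onward -p only declares the input as paired-end, and --countReadPairs must be added to count fragments
-B	With -p, only fragments with both ends aligned are counted
-C	If set, chimeric fragments (the two ends map to different chromosomes, or to the same chromosome on different strands) are not counted; only applies together with -p
-d <int>	Minimum fragment length, default 50
-D <int>	Maximum fragment length, default 600
-f	If set, counting is done at the feature level (e.g. exon level); otherwise it is done at the meta-feature level (e.g. gene level)
-g <string>	For a GTF annotation, the attribute used to group features into meta-features; default is gene_id. Make sure to choose an attribute that actually exists in the GTF file
-t <string>	Feature type to count; it must be a feature type present in the GTF file, and only reads overlapping such features are counted. Default is "exon"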
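
To show how several of these options combine, here is a minimal sketch of gene-level fragment counting for paired-end data, with an alias file passed to -A. It assumes Subread 2.0.2 or later, where --countReadPairs is available; with older versions that flag would be dropped. The helper names write_chr_alias and count_paired_end, the chromosome names, and the thread count are all illustrative assumptions.

```shell
#!/usr/bin/env bash
# Write a two-column, comma-separated alias file for -A (no header line).
# Column 1: chromosome name in the annotation; column 2: name used in the reads.
# The names below are illustrative only.
write_chr_alias() {
    cat > "$1" <<'EOF'
1,chr1
2,chr2
MT,chrM
EOF
}

# Count fragments (read pairs) for paired-end BAM files at the gene level.
# Usage: count_paired_end <annotation.gtf> <alias.csv> <output.txt> <bam>...
count_paired_end() {
    local gtf="$1" alias_file="$2" out="$3"
    shift 3
    # -B: both ends must be aligned; -C: skip chimeric fragments.
    featureCounts -p --countReadPairs -B -C \
        -T 4 -t exon -g gene_id \
        -A "$alias_file" -a "$gtf" -o "$out" "$@"
}
```

A call might look like: write_chr_alias alias.csv, followed by count_paired_end genes.gtf alias.csv counts.txt sample1.bam sample2.bam, where all file names are placeholders.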

Usage example:

$ /home/software/subread-2.0.2-Linux-x86_64/bin/featureCounts -T 5 -t exon -g gene_id -a /path-to-gtf/ERCC.gtf -o /path-to-output/all.id.txt *.bam 1>counts.id.log 2>&1
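
The output table starts with a comment line beginning with "#", followed by a header row whose first six columns are Geneid, Chr, Start, End, Strand and Length, with one count column per input file after that. As a rough sketch of a follow-up step, the helper below (extract_counts is a made-up name) keeps only the gene ID and the count columns, which is the layout most downstream tools expect.

```shell
#!/usr/bin/env bash
# Print Geneid plus the per-sample count columns (column 7 onward),
# skipping the leading '#' comment line written by featureCounts.
# Usage: extract_counts <featureCounts_output.txt>
extract_counts() {
    grep -v '^#' "$1" | cut -f 1,7-
}
```

For the command above this could be run as extract_counts /path-to-output/all.id.txt > counts.tsv, where counts.tsv is a placeholder name. featureCounts also writes assignment statistics to a companion file named after the output file with a .summary suffix.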

Link: https://www.jianshu.com/p/9cc4e8657d62

Link: https://www.jianshu.com/p/b3d1023d9017

posted @ 2021-06-12 07:49 emanlee
