KOBAS
1. What is KOBAS 3.0?
KOBAS (KEGG Orthology Based Annotation System) is a web server for gene/protein functional annotation (Annotation module) and functional gene set enrichment (Enrichment module). Given a set of genes or proteins, it determines whether a pathway, disease, or Gene Ontology (GO) term is statistically significantly enriched. The previous version, KOBAS 2.0, provides abundant gene set annotation from multiple databases covering pathways (KEGG PATHWAY, Reactome, BioCyc, PANTHER), diseases (KEGG DISEASE, OMIM, NHGRI GWAS Catalog), and GO terms, and supports more than 4,000 species. Because KOBAS 2.0 is widely used by researchers worldwide, we have updated it to KOBAS 3.0, which accepts more input data formats and provides more accurate functional enrichment algorithms.
KOBAS 3.0 consists of two modules, Annotation and Enrichment, as follows:
1.1 Annotation
The Annotation module accepts a gene/protein list as input, given as either IDs or sequences, and generates annotations for each gene based on multiple databases of pathways, diseases, and Gene Ontology terms. That is, for each gene, you can find which pathways, diseases, and GO terms are associated with it.
1.2 Enrichment
The Enrichment module tells you which pathways, diseases, and GO terms are statistically significantly associated with the genes/proteins you input.
The Enrichment module has two submodules, which differ in their input format:
1.2.1 Gene list Enrichment
This module was called “Identify” in KOBAS 2.0. It accepts the same input formats as the Annotation module, and the output of the Annotation module can also be used as input (see details at 3.1). It is based on the first generation of gene set enrichment methods, Overrepresentation Analysis (ORA), a simple and frequently used gene-level test based on the hypergeometric distribution. Many tools, such as DAVID, apply this method. In addition to the hypergeometric test, KOBAS supports the binomial test, the chi-square test, and a frequency list, along with 3 FDR correction methods: Benjamini and Hochberg (1995), Benjamini and Yekutieli (2001), and QVALUE.
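As a rough illustration of the ORA idea described above, the following Python sketch computes a one-sided hypergeometric p-value for a single gene set and applies the Benjamini and Hochberg adjustment to a list of p-values. It is not the KOBAS implementation; the function names and the toy counts are assumptions made for illustration only.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: number of background genes
    K: background genes that belong to the gene set (e.g. a pathway)
    n: number of genes in the input list
    k: input genes that belong to the gene set
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example (made-up counts): 10 background genes, 5 in the pathway,
# 2 input genes, both in the pathway.
p = hypergeom_pvalue(k=2, n=2, K=5, N=10)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

In practice one p-value is computed per pathway, disease, or GO term, and the correction is applied across all tested terms at once.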
1.2.2 Exp-data Enrichment
This module is a new feature in KOBAS 3.0. Accepting gene expression data as input is a major change for functional gene set enrichment, because it enables second-generation set-based methods and net-based methods, which use the molecular measurements that ORA ignores. By considering coordinated changes in gene expression, these methods account for dependence between genes in a pathway, which ORA does not.
This module integrates 9 methods: the set-based methods Globaltest, GSEA, GSA, PADOG, PLAGE, GAGE, and SAFE, and the net-based methods GANPA and CEPA.
Furthermore, to detect enriched gene sets supported by multiple methods, the Exp-data Enrichment module reports, for each gene set, an enrichment score and the probability of being enriched, based on the results of the 9 gene set enrichment (GSE) methods.
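To give a feel for how a set-based method uses expression-derived scores rather than only gene membership, here is a simplified Python sketch of a GSEA-style running-sum statistic. It is an illustration under stated assumptions, not the code KOBAS or any of the listed packages runs: the function name, the `weight` parameter, and the toy scores are hypothetical, and the permutation step that real tools use to assess significance is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Simplified GSEA-style running-sum enrichment score.

    ranked_genes: gene IDs sorted by descending score
    scores: dict mapping gene ID to a per-gene statistic (e.g. log fold change)
    gene_set: set of gene IDs belonging to one pathway
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_total = sum(
        abs(scores[g]) ** weight for g, hit in zip(ranked_genes, in_set) if hit
    )
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g, hit in zip(ranked_genes, in_set):
        if hit:
            # Step up, weighted by the magnitude of the gene's score.
            running += abs(scores[g]) ** weight / hit_total
        else:
            # Step down by a constant for genes outside the set.
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy data (made up): six genes ranked by score.
toy_scores = {"g1": 3.0, "g2": 2.0, "g3": 1.0, "g4": -1.0, "g5": -2.0, "g6": -3.0}
toy_ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
es_top = enrichment_score(toy_ranked, toy_scores, {"g1", "g2"})
es_bottom = enrichment_score(toy_ranked, toy_scores, {"g5", "g6"})
```

A gene set concentrated at the top of the ranking gets a positive score and one concentrated at the bottom gets a negative score, whereas ORA would treat both sets identically once a cutoff has been applied.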