Single-cell transcriptome CNV analysis | inferCNV | epiAneufinder | Gene amplification (copy number gain) | Gene deletion (copy number loss)
Definition: a CNV can go in two directions, gain and loss.
Gene amplification (copy number gain) and deletion (copy number loss) are common in cancer cells and contribute to cancer cell growth, drug sensitivity and resistance.
The Importance of Detecting Copy Number Variants (CNVs) in the Cancer Genome
- https://www.biorxiv.org/content/10.1101/2022.04.03.485795v1.full
- https://github.com/colomemaria/epiAneufinder
Installation (https://github.com/broadinstitute/infercnv)
conda install -c conda-forge r-rjags
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("infercnv")
conda install -c conda-forge r-magick
copy number variation (CNV)
A copy number variation (CNV) is when the number of copies of a particular gene varies from one individual to the next. Following the completion of the Human Genome Project, it became apparent that the genome experiences gains and losses of genetic material. The extent to which copy number variation contributes to human disease is not yet known. It has long been recognized that some cancers are associated with elevated copy numbers of particular genes.
To understand the algorithm, see the author's PhD thesis: Bioinformatic tool developments with applications to RNA-seq data analysis and clinical cancer research - BRIAN JOHN HAAS - 2021
Tutorial: InferCNV: Inferring copy number alterations from tumor single cell RNA-Seq data
conda install -c conda-forge jags
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("infercnv")
library(infercnv)
git clone https://github.com/broadinstitute/infercnv.git
cd infercnv/example
Rscript ./run.R
# create the infercnv object
infercnv_obj = CreateInfercnvObject(
    raw_counts_matrix=system.file("extdata", "oligodendroglioma_expression_downsampled.counts.matrix.gz", package = "infercnv"),
    annotations_file=system.file("extdata", "oligodendroglioma_annotations_downsampled.txt", package = "infercnv"),
    delim="\t",
    gene_order_file=system.file("extdata", "gencode_downsampled.EXAMPLE_ONLY_DONT_REUSE.txt", package = "infercnv"),
    ref_group_names=c("Microglia/Macrophage","Oligodendrocytes (non-malignant)"))
system.file("extdata", "oligodendroglioma_expression_downsampled.counts.matrix.gz", package = "infercnv")
system.file("extdata", "oligodendroglioma_annotations_downsampled.txt", package = "infercnv")
system.file("extdata", "gencode_downsampled.EXAMPLE_ONLY_DONT_REUSE.txt", package = "infercnv")
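CreateInfercnvObject takes three tab-delimited inputs: a genes-by-cells raw counts matrix, a cell annotation file, and a gene order file. Before handing your own files to it, it can help to check that they agree with each other. The following is a minimal Python sketch of such a check, not part of inferCNV. It assumes the usual layouts (counts header holding cell names only, annotations as cell name and group, gene order as gene, chromosome, start, end, none of the latter two with a header). The function names and toy data are hypothetical.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-delimited text into a list of non-empty rows."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def check_inputs(counts_text, annotations_text, gene_order_text, ref_group_names):
    """Return a list of problems; an empty list means the inputs are consistent."""
    counts = read_tsv(counts_text)
    cells = counts[0]                        # header row: cell names only (assumed layout)
    genes = [row[0] for row in counts[1:]]   # first field of each data row: gene name

    cell_to_group = {cell: group for cell, group in read_tsv(annotations_text)}
    ordered_genes = {row[0] for row in read_tsv(gene_order_text)}

    problems = []
    for cell in cells:
        if cell not in cell_to_group:
            problems.append(f"cell {cell} has no annotation")
    groups = set(cell_to_group.values())
    for name in ref_group_names:
        if name not in groups:
            problems.append(f"reference group {name} not in annotations")
    for gene in genes:
        if gene not in ordered_genes:
            problems.append(f"gene {gene} has no genomic position")
    return problems


# Toy inputs, invented for illustration only.
COUNTS = "cell1\tcell2\tcell3\nGENE_A\t5\t0\t2\nGENE_B\t1\t3\t0\n"
ANNOTATIONS = "cell1\ttumor\ncell2\tnormal\ncell3\ttumor\n"
GENE_ORDER = "GENE_A\tchr1\t100\t200\nGENE_B\tchr1\t300\t400\n"

print(check_inputs(COUNTS, ANNOTATIONS, GENE_ORDER, ["normal"]))
print(check_inputs(COUNTS, ANNOTATIONS, GENE_ORDER, ["Microglia"]))
```

The second call reports a reference group name that does not appear in the annotations, which is the kind of mismatch worth catching before a long run.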
module load singularity
singularity build infercnv.latest.simg docker://trinityctat/infercnv:latest
singularity exec -e -B `pwd` infercnv.latest.simg Rscript run.R
#!/usr/bin/env Rscript

# print a traceback when an error occurs
options(error = function() traceback(2))

packageVersion("infercnv")
library("infercnv")

# create the infercnv object
infercnv_obj = CreateInfercnvObject(
    raw_counts_matrix="~/project/scPipeline/infercnv/cell.count.matrix.txt.gz",
    annotations_file="~/project/scPipeline/infercnv/cell.anno.txt",
    delim="\t",
    gene_order_file="~/project/scPipeline/infercnv/gene.anno.txt",
    ref_group_names=c("IMR_ENCC","UE_ENCC"))

out_dir="HSCR_CNV"

# perform infercnv operations to reveal cnv signal
# cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
infercnv_obj = infercnv::run(infercnv_obj,
                             cutoff=1,
                             out_dir=out_dir,
                             cluster_by_groups=TRUE,
                             plot_steps=FALSE,
                             denoise=TRUE,
                             HMM=TRUE,
                             num_threads=10
                             )
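To see why the recommended cutoff differs between Smart-seq2 and 10x Genomics, it helps to look at what a mean-count filter does. The package documentation describes cutoff as the minimum average read count per gene among reference cells. The Python sketch below illustrates that idea only. The function name, the toy matrix, and the choice of a greater-than-or-equal comparison are assumptions, and the package's internal implementation may differ.

```python
def filter_genes_by_reference_mean(counts, is_reference, cutoff):
    """Return the indices of genes whose mean count over reference cells is >= cutoff.

    counts: one list per gene, one value per cell (genes x cells).
    is_reference: one boolean per cell, True for reference (normal) cells.
    """
    kept = []
    for gene_index, row in enumerate(counts):
        ref_values = [value for value, is_ref in zip(row, is_reference) if is_ref]
        if sum(ref_values) / len(ref_values) >= cutoff:
            kept.append(gene_index)
    return kept


# Toy genes x cells matrix, invented for illustration; the first two cells are reference.
counts = [
    [4, 2, 0, 9],   # reference mean 3.0
    [0, 1, 0, 0],   # reference mean 0.5
    [1, 1, 5, 5],   # reference mean 1.0
]
is_reference = [True, True, False, False]

print(filter_genes_by_reference_mean(counts, is_reference, cutoff=1))    # [0, 2]
print(filter_genes_by_reference_mean(counts, is_reference, cutoff=0.1))  # [0, 1, 2]
```

Sparse droplet data has low per-gene means, so a threshold of 1 would discard most genes there, while a lower threshold such as 0.1 retains sparsely detected ones.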
Reference article: Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma
Normalization of CNV profiles using signal from the ‘normal’ cluster revealed coherent chromosomal aberrations in each tumor (Fig. 1C). Gain of chromosome 7 and loss of chromosome 10, the two most common genetic alterations in glioblastoma (20), were consistently inferred in every tumor cell. Chromosomal aberrations were relatively consistent within tumors, with the exception that MGH31 appears to contain two genetic clones with discordant copy number changes on chromosomes 5, 13 and 14. While this data suggests large-scale intratumoral genetic homogeneity, we recognize that heterogeneity generated by focal alterations and point mutations will be grossly underappreciated using this method. Nevertheless, such panoramic analysis of chromosomal landscape effectively separated normal from malignant cells.
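The normalization described in this passage can be illustrated with a toy calculation: smooth each cell's expression along the genome, then subtract the average smoothed profile of the normal cells, so that gained regions come out positive and lost regions negative. The Python sketch below is only an illustration of that idea under simplifying assumptions (a single chromosome, log-scale values already in genomic order, a very small window). The function names, window size, and data are hypothetical and are not taken from the paper or from inferCNV.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the chromosome ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def cnv_profiles(expr, is_reference, window=3):
    """Smooth each cell along the genome, then subtract the reference-cell mean.

    expr: one list per cell of log-scale expression, genes in genomic order.
    is_reference: one boolean per cell, True for normal (reference) cells.
    """
    smoothed = [moving_average(cell, window) for cell in expr]
    reference = [cell for cell, is_ref in zip(smoothed, is_reference) if is_ref]
    n_genes = len(expr[0])
    baseline = [sum(cell[g] for cell in reference) / len(reference)
                for g in range(n_genes)]
    return [[cell[g] - baseline[g] for g in range(n_genes)] for cell in smoothed]


# Toy data, invented for illustration: two normal cells and one cell
# whose last three genes are expressed at a higher level (a "gain").
expr = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2],
]
profiles = cnv_profiles(expr, is_reference=[True, True, False])
for profile in profiles:
    print([round(value, 2) for value in profile])
```

The reference cells come out flat at zero, while the third cell rises toward a positive value over the last genes. Because the signal is averaged over neighbouring genes, events smaller than the window are blurred, which is consistent with the authors' caveat about focal alterations and point mutations.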
The Trinity Cancer Transcriptome Analysis Toolkit (CTAT)