Single-cell transcriptome CNV analysis | inferCNV | epiAneufinder | Gene amplification (copy number gain) | Gene deletion (copy number loss)
2023-11-09
To be precise: a CNV can go in two directions, gain and loss.
Gene amplification (copy number gain) and deletion (copy number loss) are common in cancer cells and contribute to cancer cell growth, drug sensitivity and resistance.
The Importance of Detecting Copy Number Variants (CNVs) in the Cancer Genome
2023-10-18
epiAneufinder
- https://www.biorxiv.org/content/10.1101/2022.04.03.485795v1.full
- https://github.com/colomemaria/epiAneufinder
2023-01-19
Installation (https://github.com/broadinstitute/infercnv)
    # shell: install the rjags dependency
    conda install -c conda-forge r-rjags

    # R: install infercnv from Bioconductor
    if (!require("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install("infercnv")

    # shell: install the R magick package
    conda install -c conda-forge r-magick
Analysis directory: projects/ApcKO_multiomics/ApcKO_CNV/ApcKO-CNV.ipynb
Finally found a tutorial on customized CNV analysis:
What is CNV?
copy number variation (CNV)
A copy number variation (CNV) is when the number of copies of a particular gene varies from one individual to the next. Following the completion of the Human Genome Project, it became apparent that the genome experiences gains and losses of genetic material. The extent to which copy number variation contributes to human disease is not yet known. It has long been recognized that some cancers are associated with elevated copy numbers of particular genes.
https://github.com/broadinstitute/inferCNV
To understand the algorithm, see the author's PhD thesis: Bioinformatic tool developments with applications to RNA-seq data analysis and clinical cancer research - BRIAN JOHN HAAS - 2021
Tutorial: InferCNV: Inferring copy number alterations from tumor single cell RNA-Seq data
Installation
    # shell: install JAGS
    conda install -c conda-forge jags

    # R: install and load infercnv
    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install("infercnv")
    library(infercnv)
Download the test data from the GitHub repository
    git clone https://github.com/broadinstitute/infercnv.git
    cd infercnv/example
    Rscript ./run.R
Open Jupyter on the server to inspect and prepare the input files:
    # create the infercnv object
    infercnv_obj = CreateInfercnvObject(raw_counts_matrix=system.file("extdata", "oligodendroglioma_expression_downsampled.counts.matrix.gz", package = "infercnv"),
                                        annotations_file=system.file("extdata", "oligodendroglioma_annotations_downsampled.txt", package = "infercnv"),
                                        delim="\t",
                                        gene_order_file=system.file("extdata", "gencode_downsampled.EXAMPLE_ONLY_DONT_REUSE.txt", package = "infercnv"),
                                        ref_group_names=c("Microglia/Macrophage", "Oligodendrocytes (non-malignant)"))
    system.file("extdata", "oligodendroglioma_expression_downsampled.counts.matrix.gz", package = "infercnv")
    system.file("extdata", "oligodendroglioma_annotations_downsampled.txt", package = "infercnv")
    system.file("extdata", "gencode_downsampled.EXAMPLE_ONLY_DONT_REUSE.txt", package = "infercnv")
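Before preparing your own inputs, it can help to check that the three files agree with each other. The sketch below is written in Python and is not part of inferCNV. It assumes the layout used by the example data: a tab-delimited genes-by-cells counts matrix with a header row of cell names, a headerless annotations file (cell, group), and a headerless gene order file (gene, chromosome, start, end). The function names and the toy data are hypothetical.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-delimited text into rows, skipping empty lines."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def check_inputs(counts_text, annotations_text, gene_order_text, ref_group_names):
    """Return a list of problems; an empty list means the inputs agree."""
    counts = read_tsv(counts_text)
    header, body = counts[0], counts[1:]
    # R-style matrices may omit the top-left corner field, so the header can be
    # one field shorter than the data rows; accept both layouts.
    cells = header if len(header) == len(body[0]) - 1 else header[1:]
    genes = [row[0] for row in body]

    # Assumed layouts: cell<TAB>group and gene<TAB>chr<TAB>start<TAB>end.
    cell_to_group = {row[0]: row[1] for row in read_tsv(annotations_text)}
    positioned = {row[0] for row in read_tsv(gene_order_text)}
    groups = set(cell_to_group.values())

    problems = []
    unannotated = [c for c in cells if c not in cell_to_group]
    if unannotated:
        problems.append(f"cells without a group: {unannotated}")
    absent = [g for g in ref_group_names if g not in groups]
    if absent:
        problems.append(f"reference groups not in annotations: {absent}")
    unplaced = [g for g in genes if g not in positioned]
    if unplaced:
        problems.append(f"genes without a position: {unplaced}")
    return problems


# Toy inputs (made up for illustration).
counts = "\tc1\tc2\tc3\nGENE_A\t1\t0\t2\nGENE_B\t0\t3\t1\nGENE_C\t5\t5\t0\n"
annotations = "c1\ttumor\nc2\tnormal\n"
gene_order = "GENE_A\tchr1\t100\t200\nGENE_B\tchr1\t300\t400\n"

for problem in check_inputs(counts, annotations, gene_order, ["normal", "stroma"]):
    print(problem)
```

The names passed as `ref_group_names` must match the group column of the annotations file exactly, which is the mismatch this kind of check is most likely to catch.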
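Because the local R version is too old, the latest infercnv release cannot be installed locally; a container can be used instead.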
    module load singularity

    singularity build infercnv.latest.simg docker://trinityctat/infercnv:latest

    singularity exec -e -B `pwd` infercnv.latest.simg Rscript run.R
The R script to run
    #!/usr/bin/env Rscript

    options(error = function() traceback(2))

    packageVersion("infercnv")
    library("infercnv")

    # create the infercnv object
    infercnv_obj = CreateInfercnvObject(raw_counts_matrix="~/project/scPipeline/infercnv/cell.count.matrix.txt.gz",
                                        annotations_file="~/project/scPipeline/infercnv/cell.anno.txt",
                                        delim="\t",
                                        gene_order_file="~/project/scPipeline/infercnv/gene.anno.txt",
                                        ref_group_names=c("IMR_ENCC", "UE_ENCC"))

    out_dir = "HSCR_CNV"

    # perform infercnv operations to reveal cnv signal
    infercnv_obj = infercnv::run(infercnv_obj,
                                 cutoff=1, # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
                                 out_dir=out_dir,
                                 cluster_by_groups=TRUE,
                                 plot_steps=FALSE,
                                 denoise=TRUE,
                                 HMM=TRUE,
                                 num_threads=10
                                 )
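Probably because these are not cancer samples, the results are not very striking.
Interpreting the results
Genes are ordered by chromosomal coordinate. If expression across a chromosomal segment is significantly higher or lower than in the reference cells, that segment is inferred to carry a copy number change.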
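The idea can be illustrated with a minimal Python sketch. This is a simplified illustration, not inferCNV's actual implementation: each gene is centered on its mean in the reference cells, the values are smoothed along genomic order with a sliding window, and a fixed threshold labels gain or loss. The window size, threshold, function names, and toy values are all assumptions made for the example.

```python
from statistics import mean


def relative_to_reference(expr, reference_cells):
    """Subtract each gene's mean over the reference cells from every cell.

    expr maps cell name -> list of log-expression values, one per gene,
    with genes already sorted by chromosomal position.
    """
    n_genes = len(next(iter(expr.values())))
    baseline = [mean(expr[c][g] for c in reference_cells) for g in range(n_genes)]
    return {c: [v - b for v, b in zip(vals, baseline)] for c, vals in expr.items()}


def smooth(values, window=3):
    """Centered moving average along gene order; the window shrinks at the edges."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def call_segments(smoothed, threshold=0.5):
    """Label each gene position as 'gain', 'loss' or 'neutral'."""
    return ["gain" if v > threshold else "loss" if v < -threshold else "neutral"
            for v in smoothed]


# Toy data: 8 genes in genomic order, two reference cells and one tumor cell.
expr = {
    "normal1": [1.0, 1.2, 0.8, 1.0, 1.0, 1.2, 0.8, 1.0],
    "normal2": [1.0, 0.8, 1.2, 1.0, 1.0, 0.8, 1.2, 1.0],
    "tumor1": [2.0, 2.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0],
}

relative = relative_to_reference(expr, ["normal1", "normal2"])
for cell, values in relative.items():
    print(cell, call_segments(smooth(values)))
```

Smoothing over neighboring genes is what makes the signal usable: single-gene expression is noisy, while a shift shared by many adjacent genes is more plausibly explained by a change in copy number.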
Reference article: Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma
Description of the results:
Normalization of CNV profiles using signal from the ‘normal’ cluster revealed coherent chromosomal aberrations in each tumor (Fig. 1C). Gain of chromosome 7 and loss of chromosome 10, the two most common genetic alterations in glioblastoma (20), were consistently inferred in every tumor cell. Chromosomal aberrations were relatively consistent within tumors, with the exception that MGH31 appears to contain two genetic clones with discordant copy number changes on chromosomes 5, 13 and 14. While this data suggests largescale intratumoral genetic homogeneity, we recognize that heterogeneity generated by focal alterations and point mutations will be grossly underappreciated using this method. Nevertheless, such panoramic analysis of chromosomal landscape effectively separated normal from malignant cells.
To be continued~
References:
Using the Broad Institute's inferCNV to infer CNV information from single-cell transcriptome data
The Trinity Cancer Transcriptome Analysis Toolkit (CTAT)