
A brief look at the Scanpy source: tl.score_genes_cell_cycle

Version

Import Scanpy. The version used here is '1.9.1'.

import scanpy as sc

sc.__version__
#'1.9.1'

What it does

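The function tl.score_genes_cell_cycle takes two gene sets, one for S phase and one for G2M phase, computes a score for each, and then assigns every cell a phase based on those scores. Its source code is in scanpy/tools/_score_genes.py.
For details of the scoring itself, see the previous post: https://www.yuque.com/huangsh/lq16ea/zbvnce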

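Parameters:

  • adata: the AnnData object
  • s_genes: list of S phase genes
  • g2m_genes: list of G2M phase genes
  • copy: whether to work on a copy of adata and return it
  • **kwargs: other arguments passed on to tl.score_genes. ctrl_size is already fixed to:
    • min(len(s_genes), len(g2m_genes))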

Code walkthrough

Full code

def score_genes_cell_cycle(
    adata: AnnData,
    s_genes: Sequence[str],
    g2m_genes: Sequence[str],
    copy: bool = False,
    **kwargs,
) -> Optional[AnnData]:

    logg.info('calculating cell cycle phase')

    adata = adata.copy() if copy else adata
    ctrl_size = min(len(s_genes), len(g2m_genes))
    # add s-score
    score_genes(
        adata, gene_list=s_genes, score_name='S_score', ctrl_size=ctrl_size, **kwargs
    )
    # add g2m-score
    score_genes(
        adata,
        gene_list=g2m_genes,
        score_name='G2M_score',
        ctrl_size=ctrl_size,
        **kwargs,
    )
    scores = adata.obs[['S_score', 'G2M_score']]

    # default phase is S
    phase = pd.Series('S', index=scores.index)

    # if G2M is higher than S, it's G2M
    phase[scores.G2M_score > scores.S_score] = 'G2M'

    # if all scores are negative, it's G1...
    phase[np.all(scores < 0, axis=1)] = 'G1'

    adata.obs['phase'] = phase
    logg.hint('    \'phase\', cell cycle phase (adata.obs)')
    return adata if copy else None

Choosing ctrl_size

ctrl_size is set to the size of the smaller of the two gene lists, s_genes and g2m_genes.

ctrl_size = min(len(s_genes), len(g2m_genes))
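
As a small illustration of this step, the sketch below computes ctrl_size for two short gene lists and shows why ctrl_size cannot also be supplied through **kwargs. The gene lists are just an illustrative subset, and score_genes_stub and cell_cycle_stub are hypothetical stand-ins that only mimic how the arguments are forwarded; they are not Scanpy functions.

```python
# Illustrative gene lists (a small subset chosen for the example only).
s_genes = ["MCM5", "PCNA", "TYMS"]
g2m_genes = ["HMGB2", "CDK1", "NUSAP1", "UBE2C", "BIRC5"]

# Same expression as in the source: the shorter list decides ctrl_size.
ctrl_size = min(len(s_genes), len(g2m_genes))
print(ctrl_size)  # 3


def score_genes_stub(gene_list, ctrl_size=50, **kwargs):
    """Hypothetical stand-in for score_genes; it just returns ctrl_size."""
    return ctrl_size


def cell_cycle_stub(s_genes, g2m_genes, **kwargs):
    """Hypothetical stand-in that forwards arguments like the real function."""
    ctrl_size = min(len(s_genes), len(g2m_genes))
    return score_genes_stub(gene_list=s_genes, ctrl_size=ctrl_size, **kwargs)


print(cell_cycle_stub(s_genes, g2m_genes))  # 3

# Passing ctrl_size through **kwargs collides with the explicit keyword.
try:
    cell_cycle_stub(s_genes, g2m_genes, ctrl_size=10)
except TypeError as err:
    print("TypeError:", err)
```

Because ctrl_size is passed explicitly and **kwargs is unpacked into the same call, Python rejects a second ctrl_size, which is why the parameter is effectively fixed.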

Computing the S score and the G2M score

score_genes was covered in the previous post.

    # add s-score
    score_genes(
        adata, gene_list=s_genes, score_name='S_score', ctrl_size=ctrl_size, **kwargs
    )
    # add g2m-score
    score_genes(
        adata,
        gene_list=g2m_genes,
        score_name='G2M_score',
        ctrl_size=ctrl_size,
        **kwargs,
    )

Assigning the phase

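By default, every cell is assigned to S phase.
If G2M_score > S_score, the cell is reassigned to G2M phase.
If both S_score and G2M_score are < 0, the cell is assigned to G1 phase. This check runs last, so it overrides the two assignments above.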

    # extract 'S_score' and 'G2M_score'
    scores = adata.obs[['S_score', 'G2M_score']]

    # default phase is S
    phase = pd.Series('S', index=scores.index)

    # if G2M is higher than S, it's G2M
    phase[scores.G2M_score > scores.S_score] = 'G2M'

    # if all scores are negative, it's G1...
    phase[np.all(scores < 0, axis=1)] = 'G1'

    adata.obs['phase'] = phase
    logg.hint('    \'phase\', cell cycle phase (adata.obs)')
    return adata if copy else None
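
To make the three rules concrete, here is a minimal sketch that applies the same logic to a hand-made table of scores, without any AnnData object. The cell names and score values are invented for illustration, and the mask is written as (scores < 0).all(axis=1), which is assumed here to be equivalent to the np.all(scores < 0, axis=1) call in the source.

```python
import pandas as pd

# Toy scores for four hypothetical cells (values invented for illustration).
scores = pd.DataFrame(
    {
        "S_score": [0.5, 0.1, -0.3, -0.1],
        "G2M_score": [0.2, 0.4, -0.2, 0.3],
    },
    index=["cell_a", "cell_b", "cell_c", "cell_d"],
)

# Rule 1: the default phase is S.
phase = pd.Series("S", index=scores.index)

# Rule 2: if the G2M score is higher than the S score, the phase is G2M.
phase[scores.G2M_score > scores.S_score] = "G2M"

# Rule 3: if both scores are negative, the phase is G1.
# This runs last, so it overrides rules 1 and 2.
phase[(scores < 0).all(axis=1)] = "G1"

print(phase.to_dict())
```

cell_c has G2M_score > S_score, so rule 2 first labels it G2M, but both of its scores are negative, so rule 3 relabels it G1. cell_d has only one negative score, so it stays G2M.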

posted @ 2022-09-11 21:18  何物昂