A Brief Look at the Scanpy Source: tl.score_genes_cell_cycle
Version
Import Scanpy; the version used here is '1.9.1'.
import scanpy as sc
sc.__version__
#'1.9.1'
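Function
The function tl.score_genes_cell_cycle takes two gene sets, one for S phase and one for G2M phase, computes a score for each, and then assigns every cell a phase based on those scores. Its source code is in scanpy/tools/_score_genes.py.
For details of the scoring itself, see the previous post: https://www.yuque.com/huangsh/lq16ea/zbvnce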
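Parameters:
adata: the AnnData object.
s_genes: list of S phase genes.
g2m_genes: list of G2M phase genes.
copy: whether to work on a copy of adata and return it.
**kwargs: additional arguments passed on to tl.score_genes. ctrl_size is not one of them, because it is fixed to min(len(s_genes), len(g2m_genes)).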
Code walkthrough
Full code
def score_genes_cell_cycle(
    adata: AnnData,
    s_genes: Sequence[str],
    g2m_genes: Sequence[str],
    copy: bool = False,
    **kwargs,
) -> Optional[AnnData]:
    logg.info('calculating cell cycle phase')
    adata = adata.copy() if copy else adata
    ctrl_size = min(len(s_genes), len(g2m_genes))
    # add s-score
    score_genes(
        adata, gene_list=s_genes, score_name='S_score', ctrl_size=ctrl_size, **kwargs
    )
    # add g2m-score
    score_genes(
        adata,
        gene_list=g2m_genes,
        score_name='G2M_score',
        ctrl_size=ctrl_size,
        **kwargs,
    )
    scores = adata.obs[['S_score', 'G2M_score']]
    # default phase is S
    phase = pd.Series('S', index=scores.index)
    # if G2M is higher than S, it's G2M
    phase[scores.G2M_score > scores.S_score] = 'G2M'
    # if all scores are negative, it's G1...
    phase[np.all(scores < 0, axis=1)] = 'G1'
    adata.obs['phase'] = phase
    logg.hint('     \'phase\', cell cycle phase (adata.obs)')
    return adata if copy else None
Choosing ctrl_size
The smaller of the two gene list lengths (s_genes and g2m_genes) is used as ctrl_size.
ctrl_size = min(len(s_genes), len(g2m_genes))
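As a quick illustration of this step, here is a minimal sketch with two short gene lists of different lengths. The lists are small hand-picked examples for demonstration only, not the full marker sets one would use in practice.

```python
# Toy gene lists for illustration only (assumed, not taken from the post)
s_genes = ["MCM5", "PCNA", "TYMS"]
g2m_genes = ["HMGB2", "CDK1", "NUSAP1", "UBE2C", "BIRC5"]

# Same expression as in the Scanpy source: the shorter list sets the size
ctrl_size = min(len(s_genes), len(g2m_genes))
print(ctrl_size)  # 3
```

Both score_genes calls then receive this same ctrl_size value.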
Computing the S score and G2M score
Both scores are computed with score_genes, which was covered in the previous post.
# add s-score
score_genes(
    adata, gene_list=s_genes, score_name='S_score', ctrl_size=ctrl_size, **kwargs
)
# add g2m-score
score_genes(
    adata,
    gene_list=g2m_genes,
    score_name='G2M_score',
    ctrl_size=ctrl_size,
    **kwargs,
)
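One consequence of this call pattern is worth spelling out: ctrl_size is passed explicitly alongside **kwargs, so a caller who also supplies ctrl_size through **kwargs gets a TypeError from Python itself. The sketch below shows this with stand-in functions; score_genes_stub and cell_cycle_stub are hypothetical names that only mimic the argument forwarding, not the real Scanpy functions.

```python
def score_genes_stub(adata, gene_list, score_name="score", ctrl_size=50, **kwargs):
    """Hypothetical stand-in for score_genes: it only records its arguments."""
    return {"score_name": score_name, "ctrl_size": ctrl_size, **kwargs}


def cell_cycle_stub(adata, s_genes, g2m_genes, **kwargs):
    """Hypothetical stand-in that forwards arguments the same way as the source."""
    ctrl_size = min(len(s_genes), len(g2m_genes))
    return score_genes_stub(
        adata, gene_list=s_genes, score_name="S_score", ctrl_size=ctrl_size, **kwargs
    )


# Other keyword arguments (here n_bins) are forwarded untouched
ok = cell_cycle_stub(None, ["a", "b"], ["c", "d", "e"], n_bins=25)

# Passing ctrl_size collides with the explicit ctrl_size=... in the inner call
try:
    cell_cycle_stub(None, ["a", "b"], ["c", "d", "e"], ctrl_size=10)
    raised = False
except TypeError:
    raised = True

print(ok, raised)
```

This is why ctrl_size is described as fixed: it can only be changed by calling score_genes directly.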
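Assigning the phase
By default, every cell is assigned to S phase.
If G2M_score > S_score, the cell is reclassified as G2M phase.
If both S_score and G2M_score are < 0, the cell is classified as G1 phase. This check runs last, so it overrides the two assignments above.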
# extract 'S_score' and 'G2M_score'
scores = adata.obs[['S_score', 'G2M_score']]
# default phase is S
phase = pd.Series('S', index=scores.index)
# if G2M is higher than S, it's G2M
phase[scores.G2M_score > scores.S_score] = 'G2M'
# if all scores are negative, it's G1...
phase[np.all(scores < 0, axis=1)] = 'G1'
adata.obs['phase'] = phase
logg.hint(' \'phase\', cell cycle phase (adata.obs)')
return adata if copy else None
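To make the three rules concrete, the following sketch applies the same pandas/NumPy statements to a small hand-written score table instead of adata.obs. The cell names and score values are made up for illustration; only the three assignment lines mirror the source.

```python
import numpy as np
import pandas as pd

# Made-up scores for five cells (illustrative values only)
scores = pd.DataFrame(
    {
        "S_score": [0.5, 0.1, -0.2, -0.3, 0.2],
        "G2M_score": [0.1, 0.4, -0.1, 0.2, 0.2],
    },
    index=["cell_a", "cell_b", "cell_c", "cell_d", "cell_e"],
)

# default phase is S
phase = pd.Series("S", index=scores.index)
# G2M wins only when it is strictly higher than S
phase[scores.G2M_score > scores.S_score] = "G2M"
# both scores negative: G1, overriding whatever was set before
phase[np.all(scores < 0, axis=1)] = "G1"

print(phase.tolist())  # ['S', 'G2M', 'G1', 'G2M', 'S']
```

Two edge cases show up here: cell_c would have been G2M by the second rule but ends up G1 because both scores are negative, and cell_e stays S because a tie does not satisfy the strict > comparison.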