What --recode 01, --recode 12, and --output-missing-genotype do in PLINK
1. Prepare test data: 8 samples and 8 loci
[root@linuxprobe test]# cat outcome.ped
DOR 1 0 0 0 -9 A G G G G G G C G G C C C C 0 0
DOR 2 0 0 0 -9 G G G G A G C C G G G C C C 0 0
DOR 3 0 0 0 -9 G G G G A G G C C G C C G G 0 0
DOR 4 0 0 0 -9 G G G G G G G G C G G G G G G G
DOR 5 0 0 0 -9 G G G G A G G C G G C C G G G G
DOR 6 0 0 0 -9 G G G G A A C C G G C C G G G G
DOR 7 0 0 0 -9 A A G G G G C C G G C C G G A G
DOR 9 0 0 0 -9 A A G G G G C C G G C C G G A G
[root@linuxprobe test]# cat outcome.map
1 snp1 0 55910
1 snp2 0 85204
1 snp3 0 122948
1 snp4 0 203750
1 snp5 0 312707
1 snp6 0 356863
1 snp7 0 400518
1 snp8 0 487423
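Before recoding, it helps to know which allele is the minor one at each locus. The following is a minimal Python sketch, not part of PLINK, that counts alleles per variant in the .ped data shown above. The names PED and allele_counts are made up for this illustration, and the sketch assumes the standard six leading columns (FID, IID, PAT, MAT, SEX, PHENO) and "0" as the missing allele code.

```python
from collections import Counter

# Same eight samples as outcome.ped above.
PED = """\
DOR 1 0 0 0 -9 A G G G G G G C G G C C C C 0 0
DOR 2 0 0 0 -9 G G G G A G C C G G G C C C 0 0
DOR 3 0 0 0 -9 G G G G A G G C C G C C G G 0 0
DOR 4 0 0 0 -9 G G G G G G G G C G G G G G G G
DOR 5 0 0 0 -9 G G G G A G G C G G C C G G G G
DOR 6 0 0 0 -9 G G G G A A C C G G C C G G G G
DOR 7 0 0 0 -9 A A G G G G C C G G C C G G A G
DOR 9 0 0 0 -9 A A G G G G C C G G C C G G A G
"""


def allele_counts(ped_text):
    """Count alleles per variant; the missing code '0' is skipped."""
    counts = []
    for line in ped_text.splitlines():
        if not line.strip():
            continue
        alleles = line.split()[6:]  # skip FID, IID, PAT, MAT, SEX, PHENO
        if not counts:
            counts = [Counter() for _ in range(len(alleles) // 2)]
        for i, allele in enumerate(alleles):
            if allele != "0":
                counts[i // 2][allele] += 1  # two allele columns per variant
    return counts


for idx, counter in enumerate(allele_counts(PED), start=1):
    print(f"snp{idx}", dict(counter))
```

For snp1 this gives A: 5 and G: 11, so A is the minor allele. snp2 is monomorphic (only G), and snp8 has only 10 called alleles because three samples are missing at that locus.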
2. --recode 12
plink --file outcome --recode 12 --out test;rm *.log *.nosex ## --recode 12 recodes the minor allele (PLINK's A1) as 1 and the major allele (A2) as 2; missing genotypes stay 0
[root@linuxprobe test]# cat test.map
1 snp1 0 55910
1 snp2 0 85204
1 snp3 0 122948
1 snp4 0 203750
1 snp5 0 312707
1 snp6 0 356863
1 snp7 0 400518
1 snp8 0 487423
[root@linuxprobe test]# cat test.ped
DOR 1 0 0 0 -9 1 2 2 2 2 2 1 2 2 2 2 2 1 1 0 0
DOR 2 0 0 0 -9 2 2 2 2 1 2 2 2 2 2 1 2 1 1 0 0
DOR 3 0 0 0 -9 2 2 2 2 1 2 1 2 1 2 2 2 2 2 0 0
DOR 4 0 0 0 -9 2 2 2 2 2 2 1 1 1 2 1 1 2 2 2 2
DOR 5 0 0 0 -9 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2
DOR 6 0 0 0 -9 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2
DOR 7 0 0 0 -9 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2
DOR 9 0 0 0 -9 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2
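The mapping seen in test.ped can be reproduced by hand for a single locus. Below is a rough Python sketch of the idea, assuming biallelic variants and "0" as the missing code. The function recode_12 is hypothetical and only mimics the observed output; the alphabetical tie-break is an arbitrary choice of this sketch, not PLINK's rule.

```python
from collections import Counter


def recode_12(genotypes, missing="0"):
    """Recode one variant: less frequent allele -> '1', the other -> '2'."""
    counts = Counter(a for pair in genotypes for a in pair if a != missing)
    # Ascending by count; ties broken alphabetically (sketch-only choice).
    ordered = sorted(counts, key=lambda a: (counts[a], a))
    mapping = {missing: missing}  # missing calls pass through unchanged
    if len(ordered) == 1:
        mapping[ordered[0]] = "2"  # monomorphic: only the major allele is seen
    elif len(ordered) >= 2:
        mapping[ordered[0]] = "1"
        mapping[ordered[1]] = "2"
    return [(mapping[a], mapping[b]) for a, b in genotypes]


# snp1 and snp8 columns from outcome.ped, one (allele, allele) pair per sample.
snp1 = [("A", "G")] + [("G", "G")] * 5 + [("A", "A")] * 2
snp8 = [("0", "0")] * 3 + [("G", "G")] * 3 + [("A", "G")] * 2

print(recode_12(snp1))
print(recode_12(snp8))
```

The results line up with the first and last genotype columns of test.ped above: A (minor) becomes 1, G (major) becomes 2, and the three missing calls at snp8 stay 0.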
3. --recode 01
[root@linuxprobe test]# plink --file outcome --recode 01 --out test;rm *.log *.nosex ## used on its own, this fails with an error
PLINK v1.90b6.19 64-bit (16 Sep 2020) www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to test.log.
Options in effect:
--file outcome
--out test
--recode 01
23700 MB RAM detected; reserving 11850 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (8 variants, 8 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
8 variants loaded from .bim file.
8 people (0 males, 0 females, 8 ambiguous) loaded from .fam.
Ambiguous sex IDs written to test.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 8 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.953125.
8 variants and 8 people pass filters and QC.
Note: No phenotypes present.
Error: The --recode '01' modifier normally has to be used with a nonzero
--output-missing-genotype setting.
4. --recode 01 + --output-missing-genotype
plink --file outcome --recode 01 --output-missing-genotype 9 --out test;rm *.log *.nosex ## add the --output-missing-genotype option
[root@linuxprobe test]# cat test.ped ## the minor allele becomes 0, the major allele becomes 1, and missing genotypes become 9
DOR 1 0 0 0 -9 0 1 1 1 1 1 0 1 1 1 1 1 0 0 9 9
DOR 2 0 0 0 -9 1 1 1 1 0 1 1 1 1 1 0 1 0 0 9 9
DOR 3 0 0 0 -9 1 1 1 1 0 1 0 1 0 1 1 1 1 1 9 9
DOR 4 0 0 0 -9 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 1
DOR 5 0 0 0 -9 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1
DOR 6 0 0 0 -9 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1
DOR 7 0 0 0 -9 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1
DOR 9 0 0 0 -9 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1
[root@linuxprobe test]# cat test.map
1 snp1 0 55910
1 snp2 0 85204
1 snp3 0 122948
1 snp4 0 203750
1 snp5 0 312707
1 snp6 0 356863
1 snp7 0 400518
1 snp8 0 487423
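Why the 01 coding needs a different missing code can be shown in a few lines: with the default missing code 0, a "0" in the output could mean either the minor allele or a missing call. The Python sketch below is only an illustration of that collision. recode_alleles and its parameters are hypothetical names, the A1/A2 alleles are passed in by hand, and the ValueError is this sketch's own way of flagging the clash; PLINK's internal validation may work differently.

```python
def recode_alleles(genotypes, a1, a2, mode="12", missing_in="0", missing_out="0"):
    """Recode A1/A2 as 1/2 (mode '12') or 0/1 (mode '01')."""
    codes = {"12": ("1", "2"), "01": ("0", "1")}[mode]
    if missing_out in codes:
        # The missing code would be indistinguishable from a real allele code.
        raise ValueError(
            f"missing code {missing_out!r} collides with allele codes {codes}"
        )
    mapping = {a1: codes[0], a2: codes[1], missing_in: missing_out}
    return [(mapping[a], mapping[b]) for a, b in genotypes]


# snp8 column from outcome.ped: three missing calls, A is minor, G is major.
snp8 = [("0", "0")] * 3 + [("G", "G")] * 3 + [("A", "G")] * 2

try:
    recode_alleles(snp8, "A", "G", mode="01")  # missing code left at "0"
except ValueError as err:
    print("rejected:", err)

print(recode_alleles(snp8, "A", "G", mode="01", missing_out="9"))
```

With missing_out="9" the result matches the last genotype column of test.ped in this step: 9 9 for the three missing samples, 1 1 for G G, and 0 1 for A G.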
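Conclusion: --recode 12: recodes the minor allele as 1 and the major allele as 2; missing genotypes stay 0.
--recode 01: has to be combined with a nonzero --output-missing-genotype setting; it recodes the minor allele as 0 and the major allele as 1. --output-missing-genotype sets the character that represents missing genotypes in the output, which keeps a missing call from being confused with the 0 allele code.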