plink 阈性状(质量性状)GWAS分析

plink进行阈性状(质量性状)关联分析有两种方法

--assoc 和 --logistic,

这两种方法又分别有分是否校正:

--assoc --adjust 和 --logistic --adjust

--logistic方法有可以选择是否添加协变量:

--logistic --covar 和 --logistic --adjust --covar

 

这么算的话一共有6种组合。

 

1、测试数据

复制代码
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3# wc -l *   ## 288样本, 417105位点
   417105 qc3.map
      288 qc3.ped
   426702 total
root@PC1:/home/test3# head -n 5 qc3.map
1       1:3174  0       3174
1       1:32399 0       32399
1       1:32402 0       32402
1       1:32406 0       32406
1       1:32443 0       32443
root@PC1:/home/test3# cut -f 1-10 qc3.ped | head -n 5      ## ped文件的第5列性别,第6列是表型数据,1或者2
1       1       0       0       1       1       T       C       T       C
2       2       0       0       1       1       C       C       C       C
3       3       0       0       1       1       C       C       C       C
4       4       0       0       1       2       C       C       C       C
5       5       0       0       1       1       C       C       C       C
复制代码

 

 

2、 --assoc

复制代码
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --assoc --out test    ## 直接使用assoc命令
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --assoc
  --file qc3
  --out test

15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Writing C/C --assoc report to test.assoc ... done.
root@PC1:/home/test3# ls
qc3.map  qc3.ped  test.assoc  test.log
root@PC1:/home/test3# wc -l *
   417105 qc3.map
      288 qc3.ped
   417106 test.assoc
       27 test.log
   834526 total
root@PC1:/home/test3# head -n 3 test.assoc  ## 查看结果
 CHR         SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR
   1      1:3174       3174    T   0.1351   0.1332    C     0.003612       0.9521        1.017
   1     1:32399      32399    T   0.1324   0.1288    C       0.0114        0.915        1.032
复制代码

 

3、--assoc --adjust命令

复制代码
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --assoc --adjust --out test   ## 直接使用--assoc  --adjust
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --adjust
  --assoc
  --file qc3
  --out test

15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Writing C/C --assoc report to test.assoc ... done.
--adjust: Genomic inflation est. lambda (based on median chisq) = 1.18741.
--adjust values (417105 variants) written to test.assoc.adjusted .
root@PC1:/home/test3# ls
qc3.map  qc3.ped  test.assoc  test.assoc.adjusted  test.log
root@PC1:/home/test3# wc -l *
   417105 qc3.map
      288 qc3.ped
   417106 test.assoc
   417106 test.assoc.adjusted
       30 test.log
  1251635 total
root@PC1:/home/test3# head -n 3 test.assoc    ## 查看结果
 CHR         SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR
   1      1:3174       3174    T   0.1351   0.1332    C     0.003612       0.9521        1.017
   1     1:32399      32399    T   0.1324   0.1288    C       0.0114        0.915        1.032
root@PC1:/home/test3# head -n 3 test.assoc.adjusted   ## 查看结果
 CHR         SNP      UNADJ         GC       BONF       HOLM   SIDAK_SS   SIDAK_SD     FDR_BH     FDR_BY
  20 20:11254823   7.14e-48  1.357e-40  2.978e-42  2.978e-42        INF        INF  1.489e-42  2.013e-41
  20 20:11269211   7.14e-48  1.357e-40  2.978e-42  2.978e-42        INF        INF  1.489e-42  2.013e-41
复制代码

 

4、--logistic

复制代码
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --logistic --out test    ## 直接使用logistic
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --file qc3
  --logistic
  --out test

15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Writing logistic model association results to test.assoc.logistic ... done.
root@PC1:/home/test3# ls
qc3.map  qc3.ped  test.assoc.logistic  test.log
root@PC1:/home/test3# wc -l *
   417105 qc3.map
      288 qc3.ped
   417106 test.assoc.logistic
       27 test.log
   834526 total
root@PC1:/home/test3# head test.assoc.logistic   ## 查看结果
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P
   1      1:3174       3174    T        ADD      273      1.019      0.06304       0.9497
   1     1:32399      32399    T        ADD      266      1.029       0.1025       0.9183
   1     1:32402      32402    C        ADD      267     0.9198      -0.4321       0.6657
   1     1:32406      32406    C        ADD      267     0.8815      -0.6626       0.5076
   1     1:32443      32443    C        ADD      268       1.16       0.6713        0.502
   1     1:32548      32548    T        ADD      269      1.038        0.147       0.8831
   1     1:45044      45044    A        ADD      286     0.8389      -0.7204       0.4713
   1     1:45047      45047    T        ADD      283     0.8288      -0.6577       0.5107
   1     1:45085      45085    C        ADD      283      1.007      0.03192       0.9745
复制代码

 

5、--logistic --adjust命令

复制代码
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3#
root@PC1:/home/test3#
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --logistic --adjust --out test   ## 直接运行
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --adjust
  --file qc3
  --logistic
  --out test

15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Writing logistic model association results to test.assoc.logistic ... done.
--adjust: Genomic inflation est. lambda (based on median chisq) = 1.12802.
--adjust values (417105 variants) written to test.assoc.logistic.adjusted .
root@PC1:/home/test3# ls  ## 查看生成的结果
qc3.map  qc3.ped  test.assoc.logistic  test.assoc.logistic.adjusted  test.log
root@PC1:/home/test3# wc -l *
   417105 qc3.map
      288 qc3.ped
   417106 test.assoc.logistic
   417106 test.assoc.logistic.adjusted
       30 test.log
  1251635 total
root@PC1:/home/test3# head -n 3 test.assoc.logistic   ## 查看结果
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P
   1      1:3174       3174    T        ADD      273      1.019      0.06304       0.9497
   1     1:32399      32399    T        ADD      266      1.029       0.1025       0.9183
root@PC1:/home/test3# head -n 3 test.assoc.logistic.adjusted    ## 查看结果
 CHR         SNP      UNADJ         GC       BONF       HOLM   SIDAK_SS   SIDAK_SD     FDR_BH     FDR_BY
  20 20:11020547  1.411e-23  4.399e-21  5.887e-18  5.887e-18        INF        INF  5.887e-18  7.959e-17
  20 20:11403719   4.34e-22  9.205e-20   1.81e-16   1.81e-16   2.22e-16   2.22e-16  9.052e-17  1.224e-15
复制代码

 

6、加入协变量, 如何生成协变量

step1:

复制代码
root@PC1:/home/test3# ls
qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --genome --out cov   ## 直接使用--genome命令
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to cov.log.
Options in effect:
  --file qc3
  --genome
  --out cov

15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: cov-temporary.bed + cov-temporary.bim + cov-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using up to 8 threads (change this with --threads).
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
IBD calculations complete.
Finished writing cov.genome .
root@PC1:/home/test3# ls
cov.genome  cov.log  qc3.map  qc3.ped
复制代码

 

step2:

复制代码
root@PC1:/home/test3# ls
cov.genome  cov.log  qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --read-genome cov.genome --cluster --mds-plot 10 --out cov2 
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to cov2.log.
Options in effect:
  --cluster
  --file qc3
  --mds-plot 10
  --out cov2
  --read-genome cov.genome

15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: cov2-temporary.bed + cov2-temporary.bim + cov2-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Clustering... done.
Cluster solution written to cov2.cluster1 , cov2.cluster2 , and cov2.cluster3 .
Performing multidimensional scaling analysis (SVD algorithm, 10
dimensions)... done.
MDS solution written to cov2.mds .
root@PC1:/home/test3# ls  ## 查看结果 cov2.mds为下一步要使用的结果
cov2.cluster1  cov2.cluster2  cov2.cluster3  cov2.log  cov2.mds  cov.genome  cov.log  qc3.map  qc3.ped
复制代码

 

step3:

复制代码
root@PC1:/home/test3# ls
cov2.cluster1  cov2.cluster2  cov2.cluster3  cov2.log  cov2.mds  cov.genome  cov.log  qc3.map  qc3.ped
root@PC1:/home/test3# awk '{print$1, $2, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13}' cov2.mds > cov3.txt  ## 整理格式
root@PC1:/home/test3# ls
cov2.cluster1  cov2.cluster2  cov2.cluster3  cov2.log  cov2.mds  cov3.txt  cov.genome  cov.log  qc3.map  qc3.ped
root@PC1:/home/test3# wc -l cov3.txt
289 cov3.txt
root@PC1:/home/test3# head -n 3 cov3.txt  ## 查看格式
FID IID C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
1 1 0.0111045 -0.0260147 0.0157171 -0.0128374 -0.00871872 0.015319 -0.0206926 0.0229361 -0.0262635 0.0180982
2 2 -0.00213807 0.0639865 -0.0298421 -0.0296622 0.0138335 0.057727 -0.0341887 -0.0108086 0.0270301 0.0114068
复制代码

 

8、--logistic --covar

复制代码
root@PC1:/home/test3# ls
cov3.txt  qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --covar cov3.txt --logistic --hide-covar --out test  ##添加协变量
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --covar cov3.txt
  --file qc3
  --hide-covar
  --logistic
  --out test

Note: --hide-covar flag deprecated.  Use e.g. "--linear hide-covar".
15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
--covar: 10 covariates loaded.
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Writing logistic model association results to test.assoc.logistic ... done.
root@PC1:/home/test3# ls  ## 查看生成结果
cov3.txt  qc3.map  qc3.ped  test.assoc.logistic  test.log
root@PC1:/home/test3# wc -l *
      289 cov3.txt
   417105 qc3.map
      288 qc3.ped
   417106 test.assoc.logistic
       31 test.log
   834819 total
root@PC1:/home/test3# head test.assoc.logistic -n 3   ## 查看结果
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P
   1      1:3174       3174    T        ADD      273     0.9015      -0.3232       0.7465
   1     1:32399      32399    T        ADD      266     0.9416      -0.1934       0.8467
复制代码

 

9、--logistic --adjust --covar

复制代码
root@PC1:/home/test3# ls
cov3.txt  qc3.map  qc3.ped
root@PC1:/home/test3# plink --file qc3 --covar cov3.txt --adjust --logistic --hide-covar --out test
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test.log.
Options in effect:
  --adjust
  --covar cov3.txt
  --file qc3
  --hide-covar
  --logistic
  --out test

Note: --hide-covar flag deprecated.  Use e.g. "--linear hide-covar".
15974 MB RAM detected; reserving 7987 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (417105 variants, 288 people).
--file: test-temporary.bed + test-temporary.bim + test-temporary.fam written.
417105 variants loaded from .bim file.
288 people (288 males, 0 females) loaded from .fam.
288 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
--covar: 10 covariates loaded.
Before main variant filters, 288 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.948564.
417105 variants and 288 people pass filters and QC.
Among remaining phenotypes, 75 are cases and 213 are controls.
Writing logistic model association results to test.assoc.logistic ... done.
--adjust: Genomic inflation est. lambda (based on median chisq) = 1.17784.
--adjust values (417105 variants) written to test.assoc.logistic.adjusted .
root@PC1:/home/test3# ls   ### 查看生成结果
cov3.txt  qc3.map  qc3.ped  test.assoc.logistic  test.assoc.logistic.adjusted  test.log
root@PC1:/home/test3# wc -l *
      289 cov3.txt
   417105 qc3.map
      288 qc3.ped
   417106 test.assoc.logistic
   417106 test.assoc.logistic.adjusted
       34 test.log
  1251928 total
root@PC1:/home/test3# head test.assoc.logistic -n 3  ## 查看结果
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P
   1      1:3174       3174    T        ADD      273     0.9015      -0.3232       0.7465
   1     1:32399      32399    T        ADD      266     0.9416      -0.1934       0.8467
root@PC1:/home/test3# head test.assoc.logistic.adjusted -n 3  ## 查看结果
 CHR         SNP      UNADJ         GC       BONF       HOLM   SIDAK_SS   SIDAK_SD     FDR_BH     FDR_BY
  20 20:11403719  7.848e-19  3.194e-16  3.274e-13  3.274e-13  3.274e-13  3.274e-13  3.274e-13  4.425e-12
  20 20:11020547  2.313e-18  8.011e-16  9.646e-13  9.646e-13  9.646e-13  9.646e-13   3.76e-13  5.082e-12
复制代码

 

参考:

(1)https://github.com/MareesAT/GWA_tutorial/

(2)https://blog.csdn.net/yijiaobani/article/details/105294261

 

posted @   小鲨鱼2018  阅读(2781)  评论(0编辑  收藏  举报
编辑推荐:
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
阅读排行:
· 震惊!C++程序真的从main开始吗?99%的程序员都答错了
· 【硬核科普】Trae如何「偷看」你的代码?零基础破解AI编程运行原理
· 单元测试从入门到精通
· 上周热点回顾(3.3-3.9)
· winform 绘制太阳,地球,月球 运作规律
历史上的今天:
2020-12-17 linux系统中部署apache服务(虚拟主机功能,基于端口号部署多个网站)
2020-12-17 linux系统中部署apache服务(基于主机域名部署多个网站)
2020-12-17 linux系统中部署apache虚拟主机功能(基于IP地址部署多个网站)
2020-12-17 python中逻辑操作符
2020-12-17 python中比较操作符
2020-12-17 python中变量类型转换
2020-12-17 python中长字符串
点击右上角即可分享
微信分享提示