causal SNPs | causal variants | TensorFlow | Neural Networks in Practice | Data Simulation
Start by reading a few papers:
GWAS have been successful in identifying disease susceptibility loci, but it remains a challenge to pinpoint the causal variants in subsequent fine-mapping studies. A conventional fine-mapping effort starts by sequencing dozens of randomly selected samples at susceptibility loci to discover candidate variants, which are then placed on custom arrays or used in imputation algorithms to find the causal variants. We propose that one or several rare or low-frequency causal variants can hitchhike the same common tag SNP, so causal variants may not be easily unveiled by conventional efforts. Here, we first demonstrate that the true effect size and proportion of variance explained by a collection of rare causal variants can be underestimated by a common tag SNP, thereby accounting for some of the “missing heritability” in GWAS. We then describe a case-selection approach based on phasing long-range haplotypes and sequencing cases predicted to harbor causal variants. We compare this approach with conventional strategies on a simulated data set, and we demonstrate its advantages when multiple causal variants are present. We also evaluate this approach in a GWAS on hearing loss, where the most common causal variant has a minor allele frequency (MAF) of 1.3% in the general population and 8.2% in 329 cases. With our case-selection approach, it is present in 88% of the 32 selected cases (MAF = 66%), so sequencing a subset of these cases can readily reveal the causal allele. Our results suggest that thinking beyond common variants is essential in interpreting GWAS signals and identifying causal variants.
Identification of causal genes for complex traits
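The claim in the first abstract, that a common tag SNP underestimates the effect of a rare causal variant riding on the same haplotype, can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's method: the tag allele frequency (0.30), the effect size (`beta = 1.0`) and the sample size are made up for illustration, the causal allele frequency (0.013) borrows the 1.3% MAF mentioned in the abstract, and the function names `simulate` and `slope_and_r2` are hypothetical. It uses only the Python standard library.

```python
import random


def simulate(n=20000, tag_freq=0.30, causal_freq=0.013, beta=1.0, seed=1):
    """Simulate genotypes where a rare causal allele occurs only on tag-carrying haplotypes.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    p_causal_given_tag = causal_freq / tag_freq
    tag, causal, pheno = [], [], []
    for _ in range(n):
        t = c = 0
        for _hap in range(2):  # two haplotypes per individual
            has_tag = rng.random() < tag_freq
            has_causal = has_tag and rng.random() < p_causal_given_tag
            t += has_tag
            c += has_causal
        tag.append(t)
        causal.append(c)
        # Only the causal allele affects the phenotype; the tag is a bystander.
        pheno.append(beta * c + rng.gauss(0.0, 1.0))
    return tag, causal, pheno


def slope_and_r2(x, y):
    """Least-squares slope of y on x and the fraction of variance explained."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy * sxy / (sxx * syy)


tag, causal, pheno = simulate()
slope_tag, r2_tag = slope_and_r2(tag, pheno)
slope_causal, r2_causal = slope_and_r2(causal, pheno)
print("tag SNP:    slope=%.3f  R^2=%.4f" % (slope_tag, r2_tag))
print("causal SNP: slope=%.3f  R^2=%.4f" % (slope_causal, r2_causal))
```

Under these assumptions the per-allele slope and the variance explained at the tag SNP should come out much smaller than at the causal variant, because most tag-carrying haplotypes do not carry the causal allele.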
A first look at some basic TensorFlow concepts

    # View more python tutorial on my Youtube and Youku channel!!!
    # Youtube video tutorial: https://www.youtube.com/channel/UCdyjiB5H8Pu7aDTNVXTTpcg
    # Youku video tutorial: http://i.youku.com/pythontutorial
    """
    Please note, this code is only for python 3+. If you are using python 2+, please modify the code accordingly.
    """
    # Note: this script uses the TensorFlow 1.x graph API (tf.Session, tf.train).
    from __future__ import print_function
    import tensorflow as tf
    import numpy as np

    # create data: 100 points on the line y = 0.1 * x + 0.3
    x_data = np.random.rand(100).astype(np.float32)
    y_data = x_data * 0.1 + 0.3

    ### create tensorflow structure start ###
    Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
    biases = tf.Variable(tf.zeros([1]))

    y = Weights * x_data + biases

    # mean squared error, minimized by gradient descent with learning rate 0.5
    loss = tf.reduce_mean(tf.square(y - y_data))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)
    ### create tensorflow structure end ###

    sess = tf.Session()
    # tf.initialize_all_variables() is no longer valid from
    # 2017-03-02 if using tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)

    # Weights should approach 0.1 and biases should approach 0.3
    for step in range(201):
        sess.run(train)
        if step % 20 == 0:
            print(step, sess.run(Weights), sess.run(biases))
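To see what the optimizer in the TensorFlow script is doing, the same fit can be written out by hand. The following is a minimal sketch in plain Python, not TensorFlow's implementation: it assumes the same data (y = 0.1 * x + 0.3), the same mean-squared-error loss and the same learning rate of 0.5, and computes the two gradients explicitly. The names `weight`, `bias`, `learning_rate` and `mse` are chosen here for illustration.

```python
import random

rng = random.Random(0)

# Same toy data as the TensorFlow script: points on y = 0.1 * x + 0.3.
x_data = [rng.random() for _ in range(100)]
y_data = [0.1 * x + 0.3 for x in x_data]

# Same initialization idea: random weight in [-1, 1], zero bias.
weight = rng.uniform(-1.0, 1.0)
bias = 0.0
learning_rate = 0.5


def mse(w, b):
    """Mean squared error of the line w * x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)


for step in range(201):
    # Gradients of mean((w * x + b - y)^2) with respect to w and b.
    errors = [weight * x + bias - y for x, y in zip(x_data, y_data)]
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / len(x_data)
    grad_b = 2.0 * sum(errors) / len(x_data)
    # One gradient descent update.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b
    if step % 20 == 0:
        print(step, round(weight, 4), round(bias, 4))
```

As in the TensorFlow version, the printed weight and bias should drift toward 0.1 and 0.3. The difference is that TensorFlow derives the gradients automatically from the graph, whereas here they are written by hand.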
How to generate simulated data
Data Simulation Software for Whole-Genome Association and Other Studies in Human Genetics
A comparison of tools for the simulation of genomic next-generation sequencing data
    num_cau_SNP <- 20
    num_SNP <- 500
    samplesize <- 20
    h_squared <- 0.5

    # generate genotypes from a Binomial(2, p_j) distribution,
    # where p_j is the allele frequency of SNP j
    pj <- runif(num_SNP, 0.01, 0.5)
    xij_star <- matrix(0, samplesize, num_SNP)
    # for every SNP
    for (j in 1:num_SNP) {
      xij_star[, j] <- rbinom(samplesize, 2, pj[j])
    }

    # positions of the causal SNPs
    CauSNP <- sample(1:num_SNP, num_cau_SNP, replace = FALSE)
    Ord_CauSNP <- sort(CauSNP, decreasing = FALSE)

    # generate beta, which is the best predictor
    beta <- rep(0, num_SNP)
    dim(beta) <- c(num_SNP, 1)
    # non-null betas follow a standard normal distribution
    beta[Ord_CauSNP] <- rnorm(num_cau_SNP, 0, 1)

    # epsilon: noise variance scaled so that the target heritability is h_squared
    # (h_squared was otherwise unused; this is the same quantity as
    #  t(beta) %*% t(xij_star) %*% xij_star %*% beta / samplesize * (1 - h_squared) / h_squared)
    var_e <- sum((xij_star %*% beta)^2) / samplesize * (1 - h_squared) / h_squared
    e <- rnorm(samplesize, 0, sqrt(var_e))
    dim(e) <- c(samplesize, 1)

    # generate phenotype
    pheno <- xij_star %*% beta + e

    # scale(genotype matrix)

To be continued~
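The same simulation steps can be sketched in Python, together with a check that the simulated phenotype has roughly the intended heritability. This is an illustrative port under assumptions, not a line-by-line translation of the R script: it uses a larger sample (2000 individuals, 100 SNPs) so the check is stable, it sets the noise variance from the centered sample variance of the genetic values, and the names `simulate`, `variance` and `realized_h2` are hypothetical. Only the standard library is used.

```python
import random


def variance(values):
    """Population variance (centered, divided by n)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def simulate(num_snp=100, num_causal=20, sample_size=2000, h_squared=0.5, seed=42):
    """Simulate genotypes, causal effects and a phenotype with target heritability."""
    rng = random.Random(seed)
    # Allele frequency of each SNP, uniform on [0.01, 0.5].
    freq = [rng.uniform(0.01, 0.5) for _ in range(num_snp)]
    # Genotype = allele count (0, 1 or 2), i.e. a Binomial(2, freq) draw.
    geno = [[(rng.random() < p) + (rng.random() < p) for p in freq]
            for _ in range(sample_size)]
    # Pick causal SNP positions without replacement; all other betas stay 0.
    causal = sorted(rng.sample(range(num_snp), num_causal))
    beta = [0.0] * num_snp
    for j in causal:
        beta[j] = rng.gauss(0.0, 1.0)
    # Genetic value of each individual: genotype row times beta.
    genetic = [sum(row[j] * beta[j] for j in causal) for row in geno]
    # Noise variance chosen so that var_g / (var_g + var_e) = h_squared.
    var_e = variance(genetic) * (1.0 - h_squared) / h_squared
    noise = [rng.gauss(0.0, var_e ** 0.5) for _ in range(sample_size)]
    pheno = [g + e for g, e in zip(genetic, noise)]
    return geno, causal, genetic, pheno


geno, causal, genetic, pheno = simulate()
realized_h2 = variance(genetic) / variance(pheno)
print("realized heritability: %.3f" % realized_h2)
```

With a sample this large, the realized heritability should land close to the target of 0.5. With only 20 individuals, as in the R script, the realized value can vary considerably from run to run.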