Simulating random PED and MAP files in R

 

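1. Method 1

Simulate 200 samples and 50,000 SNP loci: first draw allele codes 1/2 for every genotype, then replace the codes at each locus with two randomly chosen bases.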

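nsnp <- 50000   # number of SNPs
nind <- 200     # number of individuals

# Allele codes (1 or 2): one row per individual, two columns per SNP
nums <- sample(1:2, nsnp * 2 * nind, replace = TRUE)
snp_matrix <- matrix(nums, nrow = nind)
dim(snp_matrix)   # nind x (2 * nsnp)

# Row i of col_idx holds the two column indices that belong to SNP i
col_idx <- matrix(seq_len(nsnp * 2), ncol = 2, byrow = TRUE)

# Convert to character once, so that base letters can be assigned below
storage.mode(snp_matrix) <- "character"

for (i in seq_len(nrow(col_idx))) {
  base <- sample(c("A", "T", "C", "G"), 2)   # the two alleles of SNP i
  geno <- snp_matrix[, col_idx[i, ]]
  geno[geno == "1"] <- base[1]
  geno[geno == "2"] <- base[2]
  snp_matrix[, col_idx[i, ]] <- geno
}

# The six leading columns of a .ped file
fid <- rep("pop1", nind)                   # family ID
iid <- paste0("iid", 1:nind)               # individual ID
pid <- rep(0, nind)                        # paternal ID (0 = unknown)
mid <- pid                                 # maternal ID (0 = unknown)
sex <- sample(1:2, nind, replace = TRUE)   # 1 = male, 2 = female
phy <- sample(1:2, nind, replace = TRUE)   # phenotype: 1 = control, 2 = case

resultped <- cbind(fid, iid, pid, mid, sex, phy, snp_matrix)

# The four columns of a .map file
chr <- rep(1, nsnp)                        # chromosome
dis <- rep(0, nsnp)                        # genetic distance
pos <- sort(sample(1:200000000, nsnp))     # base-pair position
snpid <- paste0(chr, ":", pos)             # SNP ID in the form chr:pos

resultmap <- cbind(chr, snpid, dis, pos)

write.table(resultped, "result.ped", row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")
write.table(resultmap, "result.map", row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")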
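The per-SNP loop in Method 1 could also be replaced by a single indexing step. The following is only a minimal sketch on a tiny data set, not the author's method; the names `codes`, `alleles` and `snp_of_col` are made up for illustration.

```r
set.seed(1)
nsnp <- 5   # small values so the result can be inspected by eye
nind <- 4

# Allele codes (1 or 2): one row per individual, two columns per SNP
codes <- matrix(sample(1:2, nind * 2 * nsnp, replace = TRUE), nrow = nind)

# Column j of `alleles` holds the two distinct bases chosen for SNP j (2 x nsnp)
alleles <- replicate(nsnp, sample(c("A", "T", "C", "G"), 2))

# Genotype column k belongs to SNP ceiling(k / 2)
snp_of_col <- rep(seq_len(nsnp), each = 2)

# Matrix indexing: for every cell, look up alleles[code, snp].
# as.vector() walks `codes` column by column, so the SNP index of each
# column is repeated nind times to stay aligned.
idx <- cbind(as.vector(codes), rep(snp_of_col, each = nind))
geno <- matrix(alleles[idx], nrow = nind)

print(geno)
```

With `nsnp = 50000` and `nind = 200` the same lines should work unchanged; whether this is noticeably faster than the loop has not been measured here.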

 

 

2. Method 2

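nsnp <- 50000   # number of SNPs
nind <- 200     # number of individuals

# Genotype matrix: one row per individual, two allele columns per SNP
result <- matrix(NA_character_, nrow = nind, ncol = nsnp * 2)

for (i in seq_len(nsnp)) {
  base <- sample(c("A", "T", "C", "G"), 2)             # the two alleles of SNP i
  ind_snp <- sample(base, nind * 2, replace = TRUE)    # two alleles per individual
  result[, c(i * 2 - 1, i * 2)] <- ind_snp
}

# The six leading columns of a .ped file
fid <- rep("pop2", nind)                   # family ID
iid <- paste0("iid", 1:nind)               # individual ID
pat <- rep(0, nind)                        # paternal ID (0 = unknown)
mat <- pat                                 # maternal ID (0 = unknown)
sex <- sample(1:2, nind, replace = TRUE)   # 1 = male, 2 = female
phy <- sample(1:2, nind, replace = TRUE)   # phenotype, drawn independently of sex

resultped <- cbind(fid, iid, pat, mat, sex, phy, result)
dim(resultped)   # nind x (6 + 2 * nsnp)

# The four columns of a .map file
chr <- rep(1, nsnp)                        # chromosome
dis <- rep(0, nsnp)                        # genetic distance
pos <- sort(sample(1:200000000, nsnp))     # base-pair position
snpid <- paste0(chr, ":", pos)             # SNP ID in the form chr:pos

resultmap <- cbind(chr, snpid, dis, pos)

write.table(resultped, "result.ped", row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")
write.table(resultmap, "result.map", row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")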
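If the simulation is needed more than once, Method 2 could be wrapped in a small function. This is a sketch under stated assumptions: the function name `simulate_ped_map` and its arguments `prefix`, `fid` and `seed` are hypothetical and do not come from any package.

```r
# Hypothetical helper: writes <prefix>.ped and <prefix>.map following Method 2
simulate_ped_map <- function(nind, nsnp, prefix, fid = "pop2", seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # optional, for reproducible output

  # Two allele columns per SNP, filled from two randomly chosen bases
  geno <- matrix(NA_character_, nrow = nind, ncol = nsnp * 2)
  for (i in seq_len(nsnp)) {
    base <- sample(c("A", "T", "C", "G"), 2)
    geno[, c(2 * i - 1, 2 * i)] <- sample(base, nind * 2, replace = TRUE)
  }

  # FID, IID, paternal ID, maternal ID, sex, phenotype, then genotypes
  ped <- cbind(fid, paste0("iid", seq_len(nind)), 0, 0,
               sample(1:2, nind, replace = TRUE),
               sample(1:2, nind, replace = TRUE),
               geno)

  # Chromosome, SNP ID, genetic distance, base-pair position
  pos <- sort(sample.int(200000000, nsnp))
  map <- cbind(1, paste0("1:", pos), 0, pos)

  ped_file <- paste0(prefix, ".ped")
  map_file <- paste0(prefix, ".map")
  write.table(ped, ped_file, row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")
  write.table(map, map_file, row.names = FALSE, col.names = FALSE, quote = FALSE, sep = "\t")
  invisible(c(ped = ped_file, map = map_file))
}

# Usage: a tiny data set written to a temporary location
out <- simulate_ped_map(nind = 5, nsnp = 3, prefix = tempfile("sim"), seed = 42)
print(out)
```

Taking the output prefix as an argument also avoids the situation where Method 1 and Method 2 both write to `result.ped` and `result.map` and the second run overwrites the first.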

 

Validation with plink:
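Alongside PLINK itself, the structure of the two files can be checked from R before they are passed on. The sketch below uses tiny stand-in files (hypothetical data, written to a temporary directory) rather than the 200 x 50,000 output; the file names `check.ped` and `check.map` are made up for illustration.

```r
# Tiny stand-in files: 2 individuals, 2 SNPs
tmp <- tempdir()
ped_file <- file.path(tmp, "check.ped")
map_file <- file.path(tmp, "check.map")
writeLines(c("pop1\tiid1\t0\t0\t1\t2\tA\tG\tC\tC",
             "pop1\tiid2\t0\t0\t2\t1\tG\tG\tC\tT"), ped_file)
writeLines(c("1\t1:100\t0\t100",
             "1\t1:250\t0\t250"), map_file)

# colClasses = "character" stops read.table from turning the allele "T" into TRUE
ped <- read.table(ped_file, header = FALSE, colClasses = "character")
map <- read.table(map_file, header = FALSE, colClasses = "character")

n_snp <- nrow(map)
geno <- as.matrix(ped[, -(1:6)])

# A .ped row has 6 leading columns plus two allele columns per SNP in the .map
ok_cols <- ncol(ped) == 6 + 2 * n_snp

# Only base letters (or 0 for missing) should appear as alleles
ok_alleles <- all(geno %in% c("A", "C", "G", "T", "0"))

# Each SNP should show at most two distinct alleles
n_alleles <- sapply(seq_len(n_snp), function(j) {
  length(unique(as.vector(geno[, c(2 * j - 1, 2 * j)])))
})
ok_biallelic <- all(n_alleles <= 2)

# Positions in the .map file should be in ascending order
ok_sorted <- !is.unsorted(as.numeric(map[[4]]))

print(c(cols = ok_cols, alleles = ok_alleles,
        biallelic = ok_biallelic, sorted = ok_sorted))

# Assumption: if a plink executable is on the PATH, the real files could then
# be loaded with something like
#   system2("plink", c("--file", "result", "--make-bed", "--out", "result"))
```

To check the simulated output instead, `ped_file` and `map_file` would point at `result.ped` and `result.map`.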

 

Posted by 小鲨鱼2018