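Batch-querying and downloading PDB protein structure records by gene name
Download: use Entrez Direct (EDirect) to look up the structure accessions for each gene, as documented here: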
https://www.ncbi.nlm.nih.gov/books/NBK179288/
    # one gene name per line in all.epi.regulators.txt; one output file per gene
    for i in $(cat all.epi.regulators.txt); do
        echo "$i"
        esearch -db structure -query "$i [GENE]" \
            | esummary \
            | xtract -pattern DocumentSummary -element PdbAcc,ExpMethod,Resolution,PdbClass,PdbDepositDate,PdbDescr,string \
            > "$i.pdb_id"
    done
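Network connections occasionally fail, so finish with another pass: list the genes that still have no output file and download those again.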
    # genes that already have an output file
    ls *.pdb_id | sed 's/\.pdb_id$//' > downloaded.list
    # genes in the list that have no output file yet
    grep -vxFf downloaded.list all.epi.regulators.txt
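The re-download pass can be scripted. This is only a sketch under some assumptions: the function names (missing_genes, fetch_gene, retry_missing) and the RETRY_DELAY variable are made up for illustration, and an empty .pdb_id file is treated as a failed download. A gene with no structures at all would also leave an empty file, so the number of rounds is capped instead of looping until everything is filled.

```shell
#!/bin/sh
# Sketch: re-run the EDirect query for genes whose output file is missing or empty.
# All function names here are hypothetical helpers, not part of EDirect.

# Print every gene in the list that has no non-empty <gene>.pdb_id file.
missing_genes() {
    gene_list=$1
    while IFS= read -r gene; do
        [ -n "$gene" ] || continue                      # skip blank lines
        [ -s "$gene.pdb_id" ] || printf '%s\n' "$gene"  # -s: exists and is non-empty
    done < "$gene_list"
}

# Same esearch | esummary | xtract pipeline as the download loop, for one gene.
fetch_gene() {
    esearch -db structure -query "$1 [GENE]" \
        | esummary \
        | xtract -pattern DocumentSummary -element PdbAcc,ExpMethod,Resolution,PdbClass,PdbDepositDate,PdbDescr,string \
        > "$1.pdb_id"
}

# Retry the missing genes for at most max_rounds rounds (default 3),
# then print whatever is still missing.
retry_missing() {
    gene_list=$1
    max_rounds=${2:-3}
    round=1
    while [ "$round" -le "$max_rounds" ]; do
        missing=$(missing_genes "$gene_list")
        [ -n "$missing" ] || return 0
        for gene in $missing; do
            echo "round $round: $gene" >&2
            fetch_gene "$gene"
            sleep "${RETRY_DELAY:-1}"   # short pause between requests
        done
        round=$((round + 1))
    done
    missing_genes "$gene_list"
}

# Usage (needs EDirect installed and network access):
#   retry_missing all.epi.regulators.txt 3
```

Genes still printed after the last round either kept failing or have no entries in the structure database; those are worth checking by hand.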
References:
- http://localhost:17449/lab/tree/projects/LiLab/selfDB/Drug-DB/Drug_to_gene.ipynb
- ~/projects/LiLab/selfDB/rcsb_PDB/
- https://www.biostars.org/p/9576260/