Recommended bioinformatics tools (2): datasets

datasets is a cross-platform command-line tool from NCBI that makes it easy to batch-download data from NCBI databases.

Guide:

The tool is under rapid development and new features are being added gradually. Reference: https://www.ncbi.nlm.nih.gov/datasets/docs/v1/how-tos/

Installation:
curl -o datasets 'https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v1/linux-amd64/datasets'

curl -o dataformat 'https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v1/linux-amd64/dataformat'

chmod +x dataformat datasets
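
The commands above leave the two binaries in the current directory, so they have to be called as ./datasets and ./dataformat. To call them as plain datasets (as in the test further down), they need to be on PATH. Below is a minimal sketch of one way to do that; the function name install_tools and the default directory ~/bin are my own assumptions, not something the NCBI documentation prescribes.

```shell
# Sketch: copy the downloaded binaries into a directory that is on PATH.
# Assumptions: install_tools is a hypothetical helper name, and ~/bin is
# just an example install directory; any writable directory works.
install_tools() {
    src_dir="$1"                 # directory holding the downloaded files
    dest_dir="${2:-$HOME/bin}"   # where to install them
    mkdir -p "$dest_dir" || return 1
    for tool in datasets dataformat; do
        if [ ! -f "$src_dir/$tool" ]; then
            echo "missing: $src_dir/$tool" >&2
            return 1
        fi
        cp "$src_dir/$tool" "$dest_dir/$tool" || return 1
        chmod +x "$dest_dir/$tool" || return 1
    done
    # Print the install directory so the caller can add it to PATH.
    echo "$dest_dir"
}

# Usage (commented out so that sourcing this snippet has no side effects):
# export PATH="$(install_tools .):$PATH"
```

Add the export line to your shell profile if you want the change to persist across sessions.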
Usage:

For example, the help text for downloading gene and genome sequence data is shown below:

./datasets download -h

Download genome, gene and coronavirus data packages, including sequence, annotation, and metadata, as a zip file.

Refer to NCBI's [download and install](https://www.ncbi.nlm.nih.gov/datasets/docs/download-and-install/) documentation for information about getting started with the command-line tools.

Usage
  datasets download [flags]
  datasets download [command]

Examples
  datasets download genome accession GCF_000001405.40 --chromosomes X,Y --exclude-gff3 --exclude-rna
  datasets download genome taxon "bos taurus"
  datasets download gene gene-id 672
  datasets download gene symbol brca1 --taxon mouse
  datasets download gene accession NP_000483.3
  datasets download virus genome taxon sars-cov-2 --host dog
  datasets download virus protein S --host dog --filename SARS2-spike-dog.zip
  datasets download --input-json request_file.json --filename output.zip

Available Commands
  gene        download a gene dataset
  genome      download a genome dataset
  virus       download a coronavirus dataset
  ortholog    download an ortholog dataset

Flags
      --filename string     specify a custom file name for the downloaded dataset (default "ncbi_dataset.zip")
  -h, --help                help for download
      --input-json string   a file that contains a valid json request object for genome or gene queries


Global Flags
      --api-key string   NCBI Datasets API Key
      --no-progressbar   hide progress bar

Use datasets download help <command> for detailed help about a command.
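
Since the help text above shows a --filename flag, batch downloads can be scripted so that each package gets its own zip file instead of overwriting the default ncbi_dataset.zip. The following is only a sketch: download_genes and the DATASETS_BIN variable are hypothetical names I introduce here, and the second Gene ID (675) is just an illustrative value.

```shell
# Sketch: download one gene data package per Gene ID, each to its own file.
# Assumptions: download_genes and DATASETS_BIN are hypothetical names.
# DATASETS_BIN defaults to "datasets"; set it to "echo datasets" for a dry
# run that only prints the commands instead of contacting NCBI.
download_genes() {
    for id in "$@"; do
        # Uses only the subcommand and flag shown in the help text:
        #   datasets download gene gene-id <id> --filename <file>
        # DATASETS_BIN is left unquoted on purpose so "echo datasets" splits
        # into two words.
        ${DATASETS_BIN:-datasets} download gene gene-id "$id" \
            --filename "gene_${id}.zip" || return 1
    done
}

# Usage (commented out so that sourcing this snippet has no side effects):
# DATASETS_BIN="echo datasets"; download_genes 672 675   # dry run
# unset DATASETS_BIN; download_genes 672 675             # real download
```

Running the dry run first is a cheap way to check the generated command lines before starting real downloads.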
Test:

Download the sequences for the human gene with Gene ID 672. For more usage, see: https://www.ncbi.nlm.nih.gov/datasets/docs/v2/how-tos/genomes/download-genome/

datasets download gene gene-id 672 
Downloading: ncbi_dataset.zip    818kB done
# Extraction output
unzip ncbi_dataset.zip 
Archive:  ncbi_dataset.zip
  inflating: README.md               
  inflating: ncbi_dataset/data/gene.fna  
  inflating: ncbi_dataset/data/rna.fna  
  inflating: ncbi_dataset/data/protein.faa  
  inflating: ncbi_dataset/data/data_report.jsonl  
  inflating: ncbi_dataset/data/data_table.tsv  
  inflating: ncbi_dataset/data/dataset_catalog.json  
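
Once the archive is extracted, a quick sanity check of the files listed above can be done with standard shell tools. This is a minimal sketch; the helper names count_fasta_records and tsv_columns are hypothetical, and I make no claim about what counts or column names you will actually see for Gene ID 672.

```shell
# Sketch: quick inspection helpers for the extracted files.
# Assumption: both function names are hypothetical helpers, not part of
# the NCBI tools.

# Count sequence records in a FASTA file (each record header starts with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print the column names of a tab-separated file, one per line.
tsv_columns() {
    head -n 1 "$1" | tr '\t' '\n'
}

# Usage after unzipping (commented out so sourcing has no side effects):
# count_fasta_records ncbi_dataset/data/gene.fna
# count_fasta_records ncbi_dataset/data/protein.faa
# tsv_columns ncbi_dataset/data/data_table.tsv
# wc -l < ncbi_dataset/data/data_report.jsonl   # .jsonl: one JSON record per line
```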
posted @ 2023-04-19 10:50  天使不设防