pg_bulkload usage notes

I used pg_bulkload to import data a long time ago and ran comparison tests back then. Another project needs it now, so here are some notes:

 

1. The RPM package is quite old: after downloading it I found it only supports up to pg94. I am on pg10, so I gave up on it.

 

2. Download the source and install:

git clone https://github.com/ossc-db/pg_bulkload.git

cd pg_bulkload

make && make install

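-- The build runs pg_config to find the PostgreSQL installation it should compile against, so the pg_config of the target version must be on PATH.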
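
Because the build depends on whichever pg_config comes first on PATH, it can help to check it before running make. The following is a minimal sketch, not part of pg_bulkload; the function name check_pg_config is made up for illustration.

```shell
# Print the version and bin directory of the pg_config found on PATH,
# or fail early if none is found (hypothetical helper, not from pg_bulkload).
check_pg_config() {
    if ! command -v pg_config >/dev/null 2>&1; then
        echo "pg_config not found on PATH" >&2
        return 1
    fi
    pg_config --version   # e.g. the PostgreSQL release the extension will target
    pg_config --bindir    # where that installation keeps its executables
}
```

A possible use is `check_pg_config && make && make install`. If the wrong version is reported, prepend the right bin directory to PATH first; the exact directory depends on how PostgreSQL was installed.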

 

3. In the database where it will be used, run:

create extension pg_bulkload;

 

4. Import a CSV file:

pg_bulkload -i c_xxx.csv -O c_xxx -l c_xxx_load.log -d xxx -o "TYPE=CSV" -o "WRITER=PARALLEL"

 

5. Import a compressed file:

zcat c_xxx.gz |pg_bulkload -i stdin -O c_xxx -l c_xxx_load.log -d xxx -o "TYPE=CSV" -o "WRITER=PARALLEL"
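
Steps 4 and 5 differ only in how the data reaches pg_bulkload. A small helper can hide that difference by always writing the data to stdout. This is a sketch under assumptions: the name stream_file is made up, and it uses gzip -dc in place of zcat, which behaves the same on gzip files.

```shell
# Write a data file to stdout, decompressing it when the name ends in .gz.
# Hypothetical helper; gzip -dc is used as an equivalent of zcat.
stream_file() {
    case "$1" in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac
}
```

It would be used as `stream_file c_xxx.gz | pg_bulkload -i stdin -O c_xxx -l c_xxx_load.log -d xxx -o "TYPE=CSV" -o "WRITER=PARALLEL"`, with the same pg_bulkload options as in step 5.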

 

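6. The keys accepted by -o are not listed in the help output, but the load log echoes them, so the log shows which parameters can be configured: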

pg_bulkload 3.1.14 on 2018-09-28 11:31:12.641693+08

INPUT = stdin
PARSE_BADFILE = /var/lib/pgsql/pg10/data/pg_bulkload/20180928113112_sgdw_public_c_xxx.prs
LOGFILE = /var/lib/pgsql/sgdw/data/c_xxx_load.log
LIMIT = INFINITE
PARSE_ERRORS = 0
ENCODING = UTF8
CHECK_CONSTRAINTS = NO
TYPE = CSV
SKIP = 0
DELIMITER = ,
QUOTE = "\""
ESCAPE = "\""
NULL =
OUTPUT = public.c_xxx
MULTI_PROCESS = YES
VERBOSE = NO
WRITER = DIRECT
DUPLICATE_BADFILE = /var/lib/pgsql/pg10/data/pg_bulkload/20180928113112_sgdw_public_c_xxx.dup.csv
DUPLICATE_ERRORS = 0
ON_DUPLICATE_KEEP = NEW
TRUNCATE = YES


  0 Rows skipped.
  29423400 Rows successfully loaded.
  0 Rows not loaded due to parse errors.
  0 Rows not loaded due to duplicate errors.
  0 Rows replaced with new rows.

Run began on 2018-09-28 11:31:12.641693+08
Run ended on 2018-09-28 11:39:48.835205+08

CPU 2.63s/399.05u sec elapsed 516.19 sec

 

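In theory, the KEY = VALUE entries echoed at the top of the log can all be configured. For example, to set VERBOSE to YES, append -o "VERBOSE=YES" to the command.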

 

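Also: by default the delimiter is a comma, values are quoted with double quotes, and the writer is DIRECT. If you forget the defaults, run one load with default settings and check the log.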
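
Reading the option names and their effective values back out of a load log can be scripted. The sketch below assumes the log has the layout shown above, with one `KEY = VALUE` entry per line; the function names list_log_options and log_option_value are made up for illustration.

```shell
# List the option names echoed in a pg_bulkload log, one per line.
# Only lines of the form "UPPER_CASE_KEY = ..." are considered.
list_log_options() {
    sed -n 's/^\([A-Z_]\{1,\}\) =.*/\1/p' "$1"
}

# Print the value recorded for one option, e.g. WRITER or DELIMITER.
# $1 is the log file, $2 is the option name.
log_option_value() {
    sed -n "s/^$2 = *//p" "$1"
}
```

For example, `log_option_value c_xxx_load.log WRITER` prints the writer that was actually used, and `list_log_options c_xxx_load.log` gives candidate names to try with -o.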

 

A script for batch loading is attached:

-bash-4.1$ cat load.sh
#!/bin/sh

# $1: name of the gzip-compressed CSV file; the part before the first dot is used as the table name

file=$1

if [ ! -f "$file" ]
then
    echo "File does not exist: $file"
    exit 1
fi

echo "-----------------------------------------------------------------"

# Strip any directory part, then keep everything before the first dot
tbname=$(basename "$file" | cut -d . -f1)
echo "Table name is : $tbname"

zcat "$file" | pg_bulkload -i stdin -O "public.$tbname" -l "$tbname.log" -o "TYPE=CSV" -o "WRITER=PARALLEL" -d sgdw

echo "load complete"
echo "-----------------------------------------------------------------"
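
load.sh handles one file per call, so loading a whole directory needs a loop around it. The following is a minimal sketch, assuming the .gz files and load.sh sit in the current directory; the function name load_all and the LOADER variable are made up for illustration.

```shell
# Call a loader once for every .gz file in the current directory,
# stopping at the first failure. LOADER defaults to ./load.sh and can be
# overridden, for example with LOADER=echo for a dry run.
load_all() {
    loader=${LOADER:-./load.sh}
    for f in *.gz; do
        [ -f "$f" ] || continue   # nothing matched: skip the literal pattern
        "$loader" "$f" || return 1
    done
}
```

Running `LOADER=echo load_all` first prints the file names that would be loaded, which is a cheap way to check that each name maps to the intended table before starting the real load.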

 

posted @ 2018-09-28 13:04  狂神314