Bulk Data Loading
PSQL
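In the .csv file, do not put quotes around values;
write each value as-is, whatever its data type.
If the table already exists, load the data directly: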
bin/psql.py -t EXAMPLE localhost data.csv
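As an illustration of the CSV layout described above, here is a minimal sketch that writes a small data.csv and sanity-checks it. The column layout (a numeric key followed by two strings) and the sample rows are assumptions made for the example; the real EXAMPLE table may differ.

```shell
# Hypothetical rows for an assumed EXAMPLE table (numeric key, first name, last name).
# Values are written as-is, without quotes, following the note above.
cat > data.csv <<'EOF'
12345,John,Doe
67890,Mary,Poppins
EOF

# Flag any quotes, since this workflow assumes unquoted values.
if grep -q '"' data.csv; then
  echo "warning: data.csv contains quotes"
fi

# Every row should have the same number of comma-separated fields.
awk -F',' 'NR==1 {n=NF} NF!=n {bad=1; print "line " NR ": " NF " fields, expected " n} END {exit bad+0}' data.csv \
  && echo "data.csv looks consistent"
```

A file that passes both checks can then be handed to the psql.py command shown above.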
Create the table, then load the data:
./psql.py localhost:2222 XXX.sql XXX.csv
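A sketch of what the XXX.sql / XXX.csv pair might look like. The file names example.sql and example.csv, the table name and the columns are all hypothetical. As far as I know, psql.py takes the table name from the CSV file name when -t is not given, but treat that as an assumption and verify it against your Phoenix version.

```shell
# Hypothetical DDL file; the column names and types are assumptions for illustration.
cat > example.sql <<'EOF'
CREATE TABLE IF NOT EXISTS EXAMPLE (
    MY_PK BIGINT NOT NULL,
    FIRST_NAME VARCHAR(50),
    LAST_NAME VARCHAR(50),
    CONSTRAINT PK PRIMARY KEY (MY_PK)
);
EOF

# One unquoted sample row matching the columns above.
printf '%s\n' '12345,John,Doe' > example.csv

# Print the command instead of running it, so the sketch works without a cluster.
echo "./psql.py localhost:2222 example.sql example.csv"
```

Listing the .sql file before the .csv file means the table exists by the time the rows are loaded.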
MapReduce
Add to /etc/profile:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/CI-zcl/hbase-0.98.6.1-hadoop2/lib/hbase-protocol-0.98.6.1-hadoop2.jar
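After editing the profile, the change only takes effect in shells that have re-read it. The following sketch checks whether the current shell sees the hbase-protocol jar on HADOOP_CLASSPATH. The helper name has_hbase_protocol_jar is made up for this example.

```shell
# Reload the profile in the current shell first, for example:  . /etc/profile

# Hypothetical helper: succeeds if HADOOP_CLASSPATH mentions an hbase-protocol jar.
has_hbase_protocol_jar() {
  case ":${HADOOP_CLASSPATH:-}:" in
    *hbase-protocol*) return 0 ;;
    *) return 1 ;;
  esac
}

if has_hbase_protocol_jar; then
  echo "hbase-protocol jar is on HADOOP_CLASSPATH"
else
  echo "hbase-protocol jar is missing from HADOOP_CLASSPATH"
fi
```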
Error
17/02/14 16:50:08 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
17/02/14 16:50:08 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=master:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
17/02/14 16:50:08 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
17/02/14 16:50:09 INFO zookeeper.ClientCnxn: Opening socket connection to server master/10.2.32.22:2181. Will not attempt to authenticate using SASL (unknown error)
17/02/14 16:50:09 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
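Cause and fix
Read the logs; they matter.
Changing the port back to 2181 (the port the client tries in the log above) fixed it.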
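A "Connection refused" like the one in the log can be checked for before launching a job. Below is a minimal sketch that tests whether a TCP port accepts connections, using bash's /dev/tcp redirection. The function name zk_port_open is hypothetical, and the host and port are taken from the log above; adjust them to your setup.

```shell
# Hypothetical helper: $1 = host, $2 = port.
# Succeeds only if a TCP connection can be opened within 3 seconds.
zk_port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

ZK_HOST=master   # host name from the log above
ZK_PORT=2181     # port the client was trying to reach

if zk_port_open "$ZK_HOST" "$ZK_PORT"; then
  echo "$ZK_HOST:$ZK_PORT accepts connections"
else
  echo "$ZK_HOST:$ZK_PORT refused or unreachable"
fi
```

This only confirms that something is listening on the port. It does not prove the listener is a healthy ZooKeeper, so the logs remain the primary source of truth.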
Run from the Phoenix home directory:
hadoop jar /CI-zcl/phoenix-4.1.0-bin/phoenix-4.1.0-client-hadoop2.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -t TEST -i /data/tb1.csv -z master:2181
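To make the moving parts of that command easier to see, here is a dry-run sketch that assembles it from variables and prints it rather than executing it. The variable names and the function build_bulkload_cmd are hypothetical; the values are copied from the command above. The -i path is presumably resolved on HDFS, so the CSV would need to be uploaded there first (for example with hadoop fs -put), but confirm this for your cluster.

```shell
# Values copied from the command above; change them for your environment.
PHOENIX_JAR=/CI-zcl/phoenix-4.1.0-bin/phoenix-4.1.0-client-hadoop2.jar
TABLE=TEST
INPUT=/data/tb1.csv
ZK_QUORUM=master:2181

# Hypothetical helper: prints the bulk load command without running it.
build_bulkload_cmd() {
  printf 'hadoop jar %s org.apache.phoenix.mapreduce.CsvBulkLoadTool -t %s -i %s -z %s\n' \
    "$PHOENIX_JAR" "$TABLE" "$INPUT" "$ZK_QUORUM"
}

build_bulkload_cmd
```

Once the printed command looks right, it can be pasted into the shell and run as-is.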