sqoop 抽取 postgres 数据库的数据
1.下载驱动:
wget http://jdbc.postgresql.org/download/postgresql-9.2-1002.jdbc4.jar
2.放到sqoop的lib下
cp postgresql-9.2-1002.jdbc4.jar /opt/cloudera/parcels/CDH/lib/sqoop/lib/
3.测试
sqoop list-tables --connect jdbc:postgresql://xx.xx.xxx.xxx/tagdb --username postgresql --P postgresql
4.进行抽数
sqoop import --connect "jdbc:postgresql://xx.xx.xxx.xxx:5432/postdb" --username="postgresql" --password="postgresql" --outdir /tmp/.sqoop/java/ -m 1 --hive-import --hive-overwrite --target-dir /user/temp/ --hive-table "table_test" --hive-partition-key "dt" --hive-partition-value "20200101" --query "select * from table_test where 1=1 AND \$CONDITIONS" --hive-drop-import-delims --fields-terminated-by "^" --lines-terminated-by "\n" --null-string "\\\N" --null-non-string "\\\N"
查询postgres 的表字段SQL:
SELECT a.attnum, a.attname AS field, t.typname AS type, a.attlen AS length, a.atttypmod AS lengthvar, a.attnotnull AS notnull, b.description AS comment FROM pg_class c, pg_attribute a LEFT OUTER JOIN pg_description b ON a.attrelid=b.objoid AND a.attnum = b.objsubid, pg_type t WHERE c.relname = 'tab_name' --表名称 and a.attnum > 0 and a.attrelid = c.oid and a.atttypid = t.oid ORDER BY a.attnum;