Getting Started with Sqoop2: Importing Relational Database Data into HDFS

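Goal: import the TBLS table from the hive database (a MySQL database, per the JDBC connection string below) into HDFS.

Start the Sqoop2 client shell and point it at the Sqoop2 server: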

 $SQOOP2_HOME/bin/sqoop.sh client

sqoop:000> set server --host hadoop000 --port 12000 --webapp sqoop
Server is set successfully
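
The three values passed to set server (host, port, webapp) together form the base URL the client talks to. A minimal sketch of that composition follows; the function name sqoop_server_url is made up for illustration, and the /version path in the commented curl call is an assumption about the Sqoop2 REST interface that should be checked against your release.

```shell
#!/bin/sh
# Hypothetical helper: build the Sqoop2 server base URL from the same
# host, port, and webapp values given to "set server".
sqoop_server_url() {
  printf 'http://%s:%s/%s\n' "$1" "$2" "$3"
}

# Optional reachability probe (assumes curl is installed and that the
# server answers on <base-url>/version; not executed here):
#   curl -s "$(sqoop_server_url hadoop000 12000 sqoop)/version"
```

If the probe fails, the client's create and start commands will fail as well, so it can be a quick first check before filling in the connection form.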

Create the connection:

sqoop:000> create connection --cid 1
Creating connection for connector with id 1
Please fill following values to create new connection object
Name: TBLS_IMPORT_DEMO
Connection configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
Username: root
Password: ****
JDBC Connection Properties:
There are currently 0 values in the map:
entry#
Security related configuration options
Max connections: 10
New connection was successfully created with validation status FINE and persistent id 10

Create the job (--xid 10 is the persistent id of the connection created above):

sqoop:000> create job --xid 10 --type import
Creating job for connection with id 10
Please fill following values to create new job object
Name: tbls_import
Database configuration
Schema name: hive
Table name: TBLS
Table SQL statement:
Table column names:
Partition column name:
Nulls in partition column:
Boundary query:
Output configuration
Storage type:
  0 : HDFS
Choose: 0
Output format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose: 0
Compression format:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
Choose: 0
Output directory: hdfs://hadoop000:8020/sqoop2/tbls_import_demo
Throttling resources
Extractors:
Loaders:
New job was successfully created with validation status FINE  and persistent id 6

Submit the job (--jid 6 is the persistent id of the job created above):

sqoop:000> start job --jid 6

Check the job's execution status:

sqoop:000> status job --jid 6
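
The non-interactive commands in this walkthrough (set server, start job, status job) can also be collected into a client script file and run in batch mode. The sketch below only writes such a file; the function name write_sqoop_script and the path /tmp/tbls_import.sqoop are made up for illustration, and the batch invocation in the final comment assumes your Sqoop2 release accepts a script path after "client".

```shell
#!/bin/sh
# Hypothetical helper: write the submit-and-check commands for one job
# into a Sqoop2 client script. $1 = script path, $2 = job id.
write_sqoop_script() {
  script_path="$1"
  job_id="$2"
  cat > "$script_path" <<EOF
# Point the client at the Sqoop2 server used in this walkthrough
set server --host hadoop000 --port 12000 --webapp sqoop
# Submit the job, then print its current status
start job --jid $job_id
status job --jid $job_id
EOF
}

write_sqoop_script /tmp/tbls_import.sqoop 6

# Batch run (needs a working Sqoop2 installation; not executed here):
#   $SQOOP2_HOME/bin/sqoop.sh client /tmp/tbls_import.sqoop
```

The interactive create connection and create job steps are left out on purpose, since they prompt for values and are easier to do once by hand.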

After the job succeeds, list the files on HDFS:

hadoop fs -ls hdfs://hadoop000:8020/sqoop2/tbls_import_demo
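
Listing the directory only shows that files exist. To look at the imported rows themselves, something like the following sketch may help; preview_import is a made-up helper name, and it assumes the hadoop command is on the PATH and that the job wrote TEXT_FILE output as chosen above.

```shell
#!/bin/sh
# Hypothetical helper: list an import directory, then print the first
# few lines of its files. $1 = output directory, $2 = line count (default 5).
preview_import() {
  out_dir="$1"
  lines="${2:-5}"
  # Stop early if the directory cannot be listed
  hadoop fs -ls "$out_dir" || return 1
  # The glob is quoted so that hadoop, not the local shell, expands it
  hadoop fs -cat "$out_dir/*" | head -n "$lines"
}

# Example call (not executed here):
#   preview_import hdfs://hadoop000:8020/sqoop2/tbls_import_demo 5
```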

Posted on 2015-01-07 17:57 by 瞌睡中的葡萄虎
