sqoop

Copy a table from MySQL into HDFS/Hive. On HDFS, the default target path is under /user/(username).

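sqoop ## the sqoop command

import ## run an import

--connect jdbc:mysql://ip:3306/sqoop ## the JDBC URL used to connect to MySQL

--username root ## the MySQL user name

--password admin ## the MySQL password

--table mysql1 ## name of the MySQL table to import

--fields-terminated-by '\t' ## field delimiter for each row in the output files

-m 1 ## run the copy with 1 map task

--hive-import ## copy the MySQL table data into Hive; without this option the data is copied to HDFS only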
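The options above are meant to be passed together in one invocation. A minimal sketch of the assembled command follows, assuming the placeholder `ip` is supplied through a hypothetical `MYSQL_HOST` variable. The `DRY_RUN` switch and the function name `import_mysql1` are illustrative helpers, not part of Sqoop; by default the sketch only prints the command.

```shell
# Sketch: the import options listed above as a single command.
# MYSQL_HOST (hypothetical) stands in for the "ip" placeholder in the JDBC URL.
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to run it.
import_mysql1() {
    set -- sqoop import \
        --connect "jdbc:mysql://${MYSQL_HOST:-ip}:3306/sqoop" \
        --username root \
        --password admin \
        --table mysql1 \
        --fields-terminated-by '\t' \
        -m 1 \
        --hive-import
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"   # printed form omits the shell quoting around \t
    else
        "$@"
    fi
}

import_mysql1
```

When actually running it, the quotes around `'\t'` matter: they keep the shell from touching the backslash so that Sqoop receives the two characters `\t` and interprets them as a tab.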

Optional: appending new data

--append --hive-import

--check-column 'TBL_ID' ## the column examined when determining which rows to import

--incremental append ## how Sqoop determines which rows are new; legal modes are append and lastmodified

--last-value 6 ## the maximum value of the check column from the previous import
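Put together, an incremental run might look like the sketch below. In append mode Sqoop imports only rows whose check column is greater than `--last-value`. The connect string and the table `TBLS` are assumptions borrowed from the `sqoop job` example later in this post; the function name and the `DRY_RUN` switch are illustrative. `--hive-import` is left out to keep the sketch minimal, so this form writes to HDFS.

```shell
# Sketch: incremental append import, fetching rows with TBL_ID > LAST_VALUE.
# Assumed: connect string and table TBLS (taken from the job example in this post).
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to run it.
incremental_import() {
    if [ "$#" -ne 1 ]; then
        printf 'usage: incremental_import LAST_VALUE\n' >&2
        return 2
    fi
    last_value="$1"
    set -- sqoop import \
        --connect "jdbc:mysql://master.hadoop:3306/hive" \
        --username root \
        --password admin \
        --table TBLS \
        --check-column TBL_ID \
        --incremental append \
        --last-value "$last_value" \
        -m 1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

incremental_import 6
```

For the next run, the value to pass is the largest `TBL_ID` that the previous import brought in.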

Copy table data from Hive into MySQL

sqoop

export ## copy data from Hive to MySQL

--connect jdbc:mysql://ip:3306/sqoop

--username root

--password admin

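--table mysql2 ## the MySQL table that receives the data

--export-dir '/user/root/warehouse/mysql1' ## the Hive file directory to export

--fields-terminated-by '\t' ## field delimiter used in the exported Hive files

Note: the table mysql2 must already exist in MySQL.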
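As with the import, the export options go into a single command. The sketch below assembles them, again assuming a hypothetical `MYSQL_HOST` variable for the `ip` placeholder; `export_to_mysql2` and `DRY_RUN` are illustrative names rather than anything Sqoop provides.

```shell
# Sketch: the export options listed above as a single command.
# MYSQL_HOST (hypothetical) stands in for the "ip" placeholder in the JDBC URL.
# The target table mysql2 has to exist in MySQL before this is run.
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to run it.
export_to_mysql2() {
    set -- sqoop export \
        --connect "jdbc:mysql://${MYSQL_HOST:-ip}:3306/sqoop" \
        --username root \
        --password admin \
        --table mysql2 \
        --export-dir /user/root/warehouse/mysql1 \
        --fields-terminated-by '\t'
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"   # printed form omits the shell quoting around \t
    else
        "$@"
    fi
}

export_to_mysql2
```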

sqoop job --create myjob -- import --connect jdbc:mysql://master.hadoop:3306/hive --username root --password admin --table TBLS --fields-terminated-by '\t' --null-string '**' -m 1 --append --hive-import
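Creating the job only saves its definition; nothing is imported until the job is executed. The sketch below wraps the usual follow-up commands for the saved job `myjob` (`--list`, `--show`, `--exec`, `--delete`). The wrapper `job_cmd` and the `DRY_RUN` switch are illustrative helpers, not part of Sqoop.

```shell
# Sketch: follow-up commands for the saved job "myjob".
# $1 is one of: list, show, exec, delete.
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to run it.
job_cmd() {
    case "$1" in
        list)
            set -- sqoop job --list ;;
        show|exec|delete)
            set -- sqoop job "--$1" myjob ;;
        *)
            printf 'unknown action: %s\n' "$1" >&2
            return 1 ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

job_cmd list   # which jobs are saved
job_cmd show   # the stored definition of myjob
job_cmd exec   # run the saved import
```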

4. Import and export transactions are scoped to individual Mapper tasks.
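One consequence is that a failed export can leave the target table partly loaded, because the mappers that finished have already committed their rows. Sqoop's `--staging-table` option is one way to guard against this: rows are written to a staging table first and moved to the target only after every mapper succeeds. The sketch below is an assumption-laden illustration. The table name `mysql2_stage` is hypothetical and would have to exist already with the same structure as `mysql2`, and `MYSQL_HOST` and `DRY_RUN` are illustrative helpers.

```shell
# Sketch: export through a staging table so a failed mapper does not leave
# mysql2 half-filled. mysql2_stage is a hypothetical table name; it must exist
# and be structurally identical to mysql2.
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to run it.
export_with_staging() {
    set -- sqoop export \
        --connect "jdbc:mysql://${MYSQL_HOST:-ip}:3306/sqoop" \
        --username root \
        --password admin \
        --table mysql2 \
        --staging-table mysql2_stage \
        --clear-staging-table \
        --export-dir /user/root/warehouse/mysql1 \
        --fields-terminated-by '\t'
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"   # printed form omits the shell quoting around \t
    else
        "$@"
    fi
}

export_with_staging
```

`--clear-staging-table` empties the staging table before the run, so leftovers from an earlier failed attempt are not carried over.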

posted on 2015-03-14 10:42 咖啡猫1292
