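Installing and Using Sqoop
I. Installing Sqoop
Sqoop requires a working Java and Hadoop environment.
1. Download and extract
Download the latest release (1.4.6 at the time of writing) from http://ftp.wayne.edu/apache/sqoop/1.4.6/ and extract it.
2. Edit the configuration file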
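The extraction step could be scripted roughly as in the sketch below. The helper name extract_sqoop is hypothetical, and the archive name in the usage comment (sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz) is an assumption that should be checked against the directory listing at the URL above. The sketch renames the unpacked directory to sqoop, which is the directory name the later steps use.

```shell
#!/usr/bin/env bash
# Sketch only: unpack a Sqoop tarball and rename the result to <dest>/sqoop.
# extract_sqoop is a hypothetical helper, not part of Sqoop.

extract_sqoop() {
    local tarball="$1" dest="${2:-.}" top
    [ -f "$tarball" ] || { echo "no such archive: $tarball" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # Read the top-level directory name from the archive listing.
    top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    # Rename to the short name "sqoop" used in the following steps.
    if [ "$top" != "sqoop" ]; then
        mv "$dest/$top" "$dest/sqoop"
    fi
}

# Usage (archive name is an assumption):
#   extract_sqoop sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz .
```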
$ cd sqoop/conf    (assuming Sqoop was extracted into the current directory as sqoop/)
$ mv sqoop-env-template.sh sqoop-env.sh
Open sqoop-env.sh and edit the following lines (the which hadoop command shows where Hadoop is installed):
export HADOOP_COMMON_HOME=/home/hadoop/apps/hadoop-2.6.4/
export HADOOP_MAPRED_HOME=/home/hadoop/apps/hadoop-2.6.4/
export HIVE_HOME=/home/hadoop/apps/hive
3. Add the MySQL JDBC driver
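The configuration step can also be done non-interactively. The following is a minimal sketch, assuming the same paths as above; configure_sqoop_env is a hypothetical helper name. Unlike the mv shown earlier, it copies the template so the original stays available for reference, and it appends the exports rather than editing existing lines.

```shell
#!/usr/bin/env bash
# Sketch: create sqoop-env.sh from the template (if missing) and append
# the three exports. configure_sqoop_env is a hypothetical helper name.

configure_sqoop_env() {
    local conf_dir="$1" hadoop_home="$2" hive_home="$3"
    local env_file="$conf_dir/sqoop-env.sh"
    # Copy instead of move, so the template is kept.
    if [ ! -f "$env_file" ]; then
        cp "$conf_dir/sqoop-env-template.sh" "$env_file" || return 1
    fi
    {
        echo "export HADOOP_COMMON_HOME=$hadoop_home"
        echo "export HADOOP_MAPRED_HOME=$hadoop_home"
        echo "export HIVE_HOME=$hive_home"
    } >> "$env_file"
}

# Usage:
#   configure_sqoop_env sqoop/conf /home/hadoop/apps/hadoop-2.6.4 /home/hadoop/apps/hive
```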
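cp ~/apps/hive/lib/mysql-connector-java-5.1.28.jar sqoop/lib/
(The MySQL driver was already added to Hive's lib directory during the earlier Hive installation.)
If the driver is not available there, upload the driver JAR to the virtual machine and copy it into sqoop/lib.
4. Verify the installation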
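Before running any import it can be worth confirming that a driver JAR really is in place, since a missing driver only shows up later as a class-loading error. A minimal sketch follows; has_mysql_driver is a hypothetical helper, and the file name pattern mysql-connector-java-*.jar is assumed from the JAR name used above.

```shell
#!/usr/bin/env bash
# Sketch: succeed if at least one MySQL connector JAR exists in the given
# lib directory. has_mysql_driver is a hypothetical helper name.

has_mysql_driver() {
    local lib_dir="$1" jar
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Usage:
#   has_mysql_driver sqoop/lib || echo "MySQL JDBC driver missing" >&2
```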
$ cd sqoop/bin
$ ./sqoop-version
Expected output:
15/12/17 14:52:32 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Sqoop 1.4.6 git commit id 5b34accaca7de251fc91161733f906af2eddbe83
Compiled by abe on Fri Aug 1 11:19:26 PDT 2015
This completes the Sqoop installation.
5. Usage
cd sqoop/
Run bin/sqoop help to list the available commands:
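If the version check needs to be automated, the version number can be pulled out of the output shown above. This is a sketch under the assumption that the output keeps the "Sqoop <version>" line format; sqoop_version_from_output is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read sqoop-version output on stdin and print the version number.
# Only lines that start with "Sqoop " are considered, so the INFO log line
# is ignored. sqoop_version_from_output is a hypothetical helper name.

sqoop_version_from_output() {
    sed -n 's/^Sqoop \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage (from sqoop/bin; 2>&1 merges log output into the pipe):
#   ./sqoop-version 2>&1 | sqoop_version_from_output
```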
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information
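Example: import the data of a table in a MySQL database into HDFS. Given the Hadoop configuration set up earlier, the data ends up in the /usr/hadoop/ directory on HDFS (when no --target-dir is given, Sqoop writes to a directory named after the table under the importing user's HDFS home directory):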
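$ bin/sqoop import \
    --connect jdbc:mysql://min1:3306/mysql \
    --username root \
    --password 123456 \
    --table db \
    -m 1
Here --connect supplies the database connection information, --table selects the table whose data is imported, and -m 1 runs the import with a single map task.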
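Passing the password with --password leaves it in the shell history and the process list. A variant of the import above, sketched here, uses Sqoop's -P option to prompt for the password on the console and --target-dir to choose the HDFS output directory explicitly. The wrapper name import_table, the SQOOP_BIN variable, and the target directory in the usage comment are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: wrap the import so the table and the HDFS target directory are
# parameters. import_table and SQOOP_BIN are hypothetical names; the
# connection string and user are the ones used in the example above.

SQOOP_BIN="${SQOOP_BIN:-bin/sqoop}"

import_table() {
    local table="$1" target_dir="$2"
    # -P prompts for the password instead of taking it on the command line.
    "$SQOOP_BIN" import \
        --connect jdbc:mysql://min1:3306/mysql \
        --username root \
        -P \
        --table "$table" \
        --target-dir "$target_dir" \
        -m 1
}

# Usage (run from the sqoop/ directory; the target directory is an assumption):
#   import_table db /user/hadoop/db_import
```

Setting SQOOP_BIN=echo before calling import_table prints the argument list without running anything, which is a cheap way to check the command before submitting a job.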
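Sqoop runs the import as a MapReduce job. Its progress can be followed in the YARN web UI at min1:8088 and the HDFS NameNode web UI at min1:50070. If the job succeeds, the output looks like this: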
19/03/18 16:53:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552575186473_0030
19/03/18 16:53:54 INFO impl.YarnClientImpl: Submitted application application_1552575186473_0030
19/03/18 16:53:54 INFO mapreduce.Job: The url to track the job: http://min1:8088/proxy/application_1552575186473_0030/
19/03/18 16:53:54 INFO mapreduce.Job: Running job: job_1552575186473_0030
19/03/18 16:54:11 INFO mapreduce.Job: Job job_1552575186473_0030 running in uber mode : false
19/03/18 16:54:11 INFO mapreduce.Job: map 0% reduce 0%
19/03/18 16:54:32 INFO mapreduce.Job: map 100% reduce 0%
19/03/18 16:54:33 INFO mapreduce.Job: Job job_1552575186473_0030 completed successfully
19/03/18 16:54:33 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=124571
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=95
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=16705
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=16705
        Total vcore-milliseconds taken by all map tasks=16705
        Total megabyte-milliseconds taken by all map tasks=17105920
    Map-Reduce Framework
        Map input records=2
        Map output records=2
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=110
        CPU time spent (ms)=2980
        Physical memory (bytes) snapshot=103735296
        Virtual memory (bytes) snapshot=2064986112
        Total committed heap usage (bytes)=30474240
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=95
19/03/18 16:54:33 INFO mapreduce.ImportJobBase: Transferred 95 bytes in 52.9843 seconds (1.793 bytes/sec)
19/03/18 16:54:33 INFO mapreduce.ImportJobBase: Retrieved 2 records.
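The last log line is the quickest indicator of how many rows were imported. When the import runs from a script, that number can be extracted from a saved log, for example to compare it with a row count taken on the MySQL side. This is a sketch that assumes the "Retrieved N records." wording shown above; retrieved_records and the log file name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read a Sqoop import log on stdin and print the number of records
# reported by the final "Retrieved N records." line.
# retrieved_records is a hypothetical helper name.

retrieved_records() {
    sed -n 's/.*Retrieved \([0-9][0-9]*\) records\..*/\1/p' | tail -n 1
}

# Usage (import.log is an assumed file name; 2>&1 captures the log output):
#   bin/sqoop import ... 2>&1 | tee import.log
#   retrieved_records < import.log
```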