Compiling Hive from source to add a built-in UDF
Download the Hive source
[root@hadoop001 ~]# cd /opt
[root@hadoop001 opt]# mkdir sourcecode
[root@hadoop001 opt]# cd sourcecode
[root@hadoop001 sourcecode]# wget http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.7.0-src.tar.gz
[root@hadoop001 sourcecode]# ll
-rw-r--r-- 1 root root 14652104 Apr 21 10:23 hive-1.1.0-cdh5.7.0-src.tar.gz
Extract the source
[root@hadoop001 sourcecode]# tar -xzf hive-1.1.0-cdh5.7.0-src.tar.gz
[root@hadoop001 sourcecode]# ll
total 14316
drwxrwxr-x 31 root root     4096 Mar 24  2016 hive-1.1.0-cdh5.7.0
-rw-r--r--  1 root root 14652104 Apr 21 10:23 hive-1.1.0-cdh5.7.0-src.tar.gz
Add the UDF class
HelloUDF.java
[root@hadoop001 udf]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf
[root@hadoop001 udf]# rz    ## upload the UDF source file you wrote yourself
[root@hadoop001 udf]# vim HelloUDF.java
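Change the first line so that the class declares the package: package org.apache.hadoop.hive.ql.udf;
[The package name mirrors the directory org/apache/hadoop/hive/ql/udf, which is where HelloUDF.java now lives.]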
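The rule above (package name equals the path below src/java) can be checked mechanically. The following is a minimal sketch; the helper names package_for_dir and declared_package are hypothetical and not part of Hive or any tool.

```shell
# Hypothetical helpers: compare the package a .java file declares with the
# package implied by its directory under src/java.

package_for_dir() {
  # $1: directory that contains the .java file.
  # Drop everything up to "src/java/", drop trailing slashes, turn "/" into ".".
  printf '%s\n' "$1" | sed -e 's#^.*/src/java/##' -e 's#/*$##' -e 's#/\{1,\}#.#g'
}

declared_package() {
  # $1: path to a .java file; prints the first package declaration found.
  sed -n 's/^[[:space:]]*package[[:space:]]\{1,\}\([^;[:space:]]*\)[[:space:]]*;.*$/\1/p' "$1" | head -n 1
}

# Pure string manipulation, so this runs without the source tree present.
package_for_dir /opt/sourcecode/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf
# prints org.apache.hadoop.hive.ql.udf
```

Inside the udf directory, comparing the output of declared_package HelloUDF.java with package_for_dir "$(pwd)" shows whether the first line was edited correctly.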
Register the function
[root@hadoop001 exec]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/exec/
[root@hadoop001 exec]# vim FunctionRegistry.java
Add the following at line 135 (with the other imports):
import org.apache.hadoop.hive.ql.udf.HelloUDF;
Add the following at line 176 (with the other registration calls):
system.registerUDF("HelloUDF", HelloUDF.class,false);
### The first argument, "HelloUDF", is the function name and can be anything you like; the second, HelloUDF.class, is the UDF class itself.
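Line numbers 135 and 176 apply to this exact source release, so it can help to confirm both edits by content rather than by position. A minimal sketch, assuming the edits look like the two lines shown above; check_registration is a hypothetical helper name.

```shell
# Hypothetical helper: confirm that FunctionRegistry.java both imports and
# registers a given UDF class, regardless of which line the edits landed on.
check_registration() {
  # $1: path to FunctionRegistry.java, $2: simple class name of the UDF
  reg_file=$1
  reg_class=$2
  if ! grep -q "^import org\.apache\.hadoop\.hive\.ql\.udf\.${reg_class};" "$reg_file"; then
    echo "missing import for ${reg_class}"
    return 1
  fi
  if ! grep -q "system\.registerUDF(.*${reg_class}\.class" "$reg_file"; then
    echo "missing registerUDF call for ${reg_class}"
    return 1
  fi
  echo "${reg_class} is imported and registered"
}

# Only runs when the file from this walkthrough is actually present.
registry=/opt/sourcecode/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
if [ -f "$registry" ]; then
  check_registration "$registry" HelloUDF
fi
```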
Build Hive
[root@hadoop001 hive-1.1.0-cdh5.7.0]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0
[root@hadoop001 hive-1.1.0-cdh5.7.0]# mvn clean package -DskipTests -Phadoop-2 -Pdist
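- Now wait for the build to finish. If it fails, the cause is almost always configuration. I fought build errors for two exhausting days, so here is what I learned:
- 1. Check your Maven version and prefer a recent one. The newest at the time of writing was apache-maven-3.6.1; with apache-maven-3.3.9 the build failed for me.
- 2. After switching Maven versions, make sure the environment variables point at the new installation. If they still point at the old one, the build fails.
- 3. Keep the user-level and system-wide environment settings consistent, or configure only the user-level ones; otherwise the build fails.
- 4. Settings file ### Back up your existing configuration, clear it out, and paste in the repositories below. (In Maven's settings.xml a <repositories> element is only valid inside a <profile>; in a pom.xml it sits directly under <project>.)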
<repositories>
  <!-- This needs to be removed before checking in-->
  <repository>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>cdh.releases.repo</id>
    <url>https://repository.cloudera.com/content/groups/cdh-releases-rcs</url>
    <name>CDH Releases Repository</name>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>cdh.snapshots.repo</id>
    <url>https://repository.cloudera.com/content/repositories/snapshots</url>
    <name>CDH Snapshots Repository</name>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>datanucleus</id>
    <name>datanucleus maven repository</name>
    <url>http://www.datanucleus.org/downloads/maven2</url>
    <layout>default</layout>
    <releases>
      <enabled>true</enabled>
      <checksumPolicy>warn</checksumPolicy>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>glassfish-repository</id>
    <url>http://maven.glassfish.org/content/groups/glassfish</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>glassfish-repo-archive</id>
    <url>http://maven.glassfish.org/content/groups/glassfish</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>sonatype-snapshot</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
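Points 1 to 3 above come down to knowing which mvn binary the shell actually runs and which version it reports. A minimal sketch of that check follows; the function names are hypothetical, the 3.6.1 threshold reflects only the experience described above (not a documented Hive requirement), and sort -V is assumed to be available (GNU coreutils).

```shell
# Hypothetical helpers for checking the active Maven installation.

parse_maven_version() {
  # Reads `mvn -v` output on stdin and prints the version number,
  # e.g. "Apache Maven 3.6.1 (...)" -> "3.6.1".
  sed -n 's/^Apache Maven \([0-9][0-9.]*\).*$/\1/p' | head -n 1
}

version_ge() {
  # Succeeds when version $1 >= version $2 (version-aware sort, GNU sort -V).
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only runs when mvn is on the PATH.
if command -v mvn >/dev/null 2>&1; then
  echo "mvn binary in use: $(command -v mvn)"
  echo "MAVEN_HOME: ${MAVEN_HOME:-unset}"
  current=$(mvn -v 2>/dev/null | parse_maven_version)
  if version_ge "$current" 3.6.1; then
    echo "Maven $current is at least 3.6.1"
  else
    echo "Maven '$current' is older than 3.6.1; the build failed for the author on 3.3.9"
  fi
fi
```

If the path printed for the mvn binary does not sit under the directory MAVEN_HOME names, the shell is still picking up the old installation, which is the situation point 2 warns about.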
Build succeeded
[Note: the build may appear to stall partway through. Don't worry, just keep waiting.]
The package we need is apache-hive-1.1.0-cdh5.7.0-bin.tar.gz.
[root@hadoop001 target]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/packaging/target
[root@hadoop001 target]# ll
total 129260
drwxr-xr-x 2 root root      4096 Apr 22 21:17 antrun
drwxr-xr-x 3 root root      4096 Apr 22 21:17 apache-hive-1.1.0-cdh5.7.0-bin
-rw-r--r-- 1 root root 105854885 Apr 22 21:17 apache-hive-1.1.0-cdh5.7.0-bin.tar.gz
-rw-r--r-- 1 root root  12656493 Apr 22 21:18 apache-hive-1.1.0-cdh5.7.0-jdbc.jar
-rw-r--r-- 1 root root  13823053 Apr 22 21:18 apache-hive-1.1.0-cdh5.7.0-src.tar.gz
drwxr-xr-x 2 root root      4096 Apr 22 21:17 archive-tmp
drwxr-xr-x 3 root root      4096 Apr 22 21:17 maven-shared-archive-resources
drwxr-xr-x 3 root root      4096 Apr 22 21:17 tmp
drwxr-xr-x 2 root root      4096 Apr 22 21:17 warehouse
[root@hadoop001 lib]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/packaging/target/apache-hive-1.1.0-cdh5.7.0-bin/apache-hive-1.1.0-cdh5.7.0-bin/lib
[root@hadoop001 lib]# ll hive-exec-1.1.0-cdh5.7.0.jar
-rw-r--r-- 1 root root 19272399 Apr 22 21:17 hive-exec-1.1.0-cdh5.7.0.jar
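## Copy hive-exec-1.1.0-cdh5.7.0.jar into the lib directory of your Hive installation in place of the original jar. Here the original is renamed as a backup instead of being deleted.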
[root@hadoop001 lib]# su - hadoop
[hadoop@hadoop001 lib]$ pwd
/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib
[hadoop@hadoop001 lib]$ ll hive-exec-1.1.0-cdh5.7.0.jar
-rw-r--r-- 1 hadoop hadoop 19274557 Apr 21 18:54 hive-exec-1.1.0-cdh5.7.0.jar
[hadoop@hadoop001 lib]$ mv hive-exec-1.1.0-cdh5.7.0.jar hive-exec-1.1.0-cdh5.7.0.jar_yuan    ## the original jar is now renamed
Copy the new jar into /home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/
[root@hadoop001 lib]# cp hive-exec-1.1.0-cdh5.7.0.jar /home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/
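Before starting Hive it may be worth confirming that the copied jar really contains the new class. A minimal sketch, assuming unzip is installed; jar_has_class is a hypothetical helper name, and the class name is the one used in this walkthrough.

```shell
# Hypothetical helper: succeed when a jar contains the given class.
jar_has_class() {
  # $1: path to the jar, $2: fully qualified class name
  jar_entry="$(printf '%s' "$2" | tr '.' '/').class"
  # List the archive; a missing jar or missing unzip counts as "not found".
  jar_listing=$(unzip -l "$1" 2>/dev/null) || return 1
  case "$jar_listing" in
    *"$jar_entry"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only runs when the jar from this walkthrough is actually present.
deployed_jar=/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/hive-exec-1.1.0-cdh5.7.0.jar
if [ -f "$deployed_jar" ]; then
  if jar_has_class "$deployed_jar" org.apache.hadoop.hive.ql.udf.HelloUDF; then
    echo "HelloUDF is packaged in the deployed jar"
  else
    echo "HelloUDF not found; the old jar may still be in place"
  fi
fi
```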
Test
hive (default)> show functions;

helloudf

hive (default)> select helloudf('zz') from dual;
OK
Hello:zz
Time taken: 0.922 seconds, Fetched: 1 row(s)