Hive UDF functions
First, add the required dependencies:
<dependencies>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>1.2.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.4</version>
    </dependency>
</dependencies>
Configure the repository mirrors in Maven's conf/settings.xml. The Aliyun mirror is the one I have been using; I added it directly and it can be used as-is:
<mirrors>
    <mirror>
        <id>mirror</id>
        <mirrorOf>!rdc-releases,!rdc-snapshots</mirrorOf>
        <name>mirror</name>
        <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </mirror>
    <mirror>
        <id>nexus-osc</id>
        <mirrorOf>central</mirrorOf>
        <name>Nexus osc</name>
        <url>http://maven.oschina.net/content/groups/public/</url>
    </mirror>
</mirrors>
The class must extend the UDF class and implement an evaluate method; the method name must be evaluate:
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Converts a dotted-quad IPv4 address (e.g. "192.168.1.1")
// to its 32-bit binary string representation.
public class HelloUDF extends UDF {
    public Text evaluate(Text s) {
        if (s == null) {
            return null;
        }
        String[] items = s.toString().split("\\.");
        if (items.length != 4) {
            return null;
        }
        StringBuilder ret = new StringBuilder();
        for (String item : items) {
            StringBuilder sb = new StringBuilder();
            int a = Integer.parseInt(item);
            // Build the 8-bit binary form of this octet, low bit first.
            for (int i = 0; i < 8; i++) {
                sb.insert(0, a % 2);
                a = a / 2;
            }
            ret.append(sb);
        }
        return new Text(ret.toString());
    }
}
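To check the conversion logic before deploying to Hive, the same algorithm can be sketched in plain Java without the Hive/Hadoop dependency. The class and method names below (IpToBinary, toBinary) are my own, chosen just for this illustration:

```java
// Plain-Java sketch of the same IPv4-to-binary conversion the UDF performs,
// kept free of Hive/Hadoop dependencies so it can be run standalone.
public class IpToBinary {

    // Returns the 32-bit binary string for a dotted-quad IPv4 address,
    // or null for malformed input (mirroring the UDF's null handling).
    public static String toBinary(String ip) {
        if (ip == null) {
            return null;
        }
        String[] items = ip.split("\\.");
        if (items.length != 4) {
            return null;
        }
        StringBuilder ret = new StringBuilder();
        for (String item : items) {
            int a = Integer.parseInt(item);
            StringBuilder sb = new StringBuilder();
            // Prepend a%2 eight times: builds the octet's 8-bit form.
            for (int i = 0; i < 8; i++) {
                sb.insert(0, a % 2);
                a = a / 2;
            }
            ret.append(sb);
        }
        return ret.toString();
    }

    public static void main(String[] args) {
        // 192 -> 11000000, 168 -> 10101000, 1 -> 00000001
        System.out.println(toBinary("192.168.1.1"));
    }
}
```

Running main prints the 32-character binary string for 192.168.1.1; malformed input such as "1.2.3" yields null.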
Package with Maven: clean, compile, package.
Copy the class path: select the class and click Copy Reference (in IntelliJ IDEA) to get the fully qualified class name.
Enter Hive, upload the jar to a path on the server, then add the jar inside Hive:
add jar /path/xxx.jar
list jars; shows the added jars.
show functions; lists the available functions.
Creating a temporary function
Syntax: CREATE TEMPORARY FUNCTION function_name AS class_name;
function_name: the name of the function.
class_name: the class path, i.e. package name + class name (the path copied earlier with Copy Reference).
hive> create temporary function ipcast as 'HelloUDF';
OK
Time taken: 0.087 seconds
Dropping a temporary function
Syntax: DROP TEMPORARY FUNCTION [IF EXISTS] function_name;
hive> DROP TEMPORARY FUNCTION IF EXISTS ipcast;
OK
Time taken: 0.015 seconds