UDFs in Hive

LanguageManual UDF

I. Categories

UDF: User-Defined Function
	one row in, one value out
UDAF: User-Defined Aggregation Function
	aggregate function: many rows in, one value out
	e.g. max, min, count
UDTF: User-Defined Table-Generating Function
	one row in, many rows out
	e.g. lateral view explode
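
The three shapes can be sketched in plain Java, without any Hive dependency. This is only an illustration of the input/output cardinality, not Hive's actual API; the class name UdfShapes and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UdfShapes {
    // UDF shape: one value in, one value out.
    public static String lower(String s) {
        return s == null ? null : s.toLowerCase();
    }

    // UDAF shape: many values in, one value out (like max).
    public static int max(List<Integer> values) {
        return Collections.max(values);
    }

    // UDTF shape: one value in, many rows out (like explode on an array).
    public static List<String> explode(String csv) {
        return Arrays.asList(csv.split(","));
    }

    public static void main(String[] args) {
        System.out.println(lower("HIVE"));                 // hive
        System.out.println(max(Arrays.asList(3, 9, 4)));   // 9
        System.out.println(explode("a,b,c"));              // [a, b, c]
    }
}
```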

II. Hands-on

1. Create a Maven project and edit pom.xml

hive-pom.xml

2. Create a new class that extends UDF and implements one or more methods named evaluate.

package com.cenzhongman.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class LowerUDF extends UDF{

	// Implement one or more methods named evaluate, which Hive will call
	// (how Hive resolves the method to call can be configured by setting a
	// custom UDFMethodResolver). Some examples:
	//   public int evaluate();
	//   public int evaluate(int a);
	//   public double evaluate(int a, double b);
	//   public String evaluate(String a, int b, Text c);
	//   public Text evaluate(String a);
	//   public String evaluate(List<Integer> a);
	//     (Hive arrays are represented as Lists in Java, so an ARRAY<int>
	//     column is passed in as a List<Integer>.)
	// evaluate must never be a void method, but it can return null if needed.
	// Return types and method arguments can be either Java primitives or the
	// corresponding Writable class.
	// Recommended: use the Hadoop Writable (MapReduce) types for parameters.

	public Text evaluate(Text str) {
		// A NULL column value arrives as a null reference, so check str itself
		if (str == null) {
			return null;
		}
		// Convert to lower case
		return new Text(str.toString().toLowerCase());
	}
	
	// For local testing only; Hive's entry point is evaluate, so main has no effect on the UDF
	public static void main(String[] args) {
		System.out.println(new LowerUDF().evaluate(new Text("Hive")));
	}
}
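
The comment in the class above says Hive picks which evaluate overload to call. As a rough illustration of that idea, the sketch below selects an overload by parameter types using standard Java reflection. It is not Hive's real UDFMethodResolver; the names EvaluateResolutionSketch, Sample, and call are hypothetical and exist only for this example.

```java
import java.lang.reflect.Method;

public class EvaluateResolutionSketch {
    // Hypothetical stand-in for a UDF class with overloaded evaluate methods.
    public static class Sample {
        public int evaluate(int a) {
            return a + 1;
        }

        public String evaluate(String a) {
            return a == null ? null : a.toLowerCase();
        }
    }

    // Looks up the evaluate overload with exactly these parameter types and invokes it.
    public static Object call(Object udf, Class<?>[] types, Object... args) {
        try {
            Method m = udf.getClass().getMethod("evaluate", types);
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no matching evaluate method", e);
        }
    }

    public static void main(String[] args) {
        Sample udf = new Sample();
        System.out.println(call(udf, new Class<?>[] {int.class}, 41));        // 42
        System.out.println(call(udf, new Class<?>[] {String.class}, "Hive")); // hive
    }
}
```

The argument types decide which evaluate runs, which is why a single UDF class can serve several column types.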

3. Use the custom function in Hive

-- Add the jar to the session resources
add jar /opt/datas/filename.jar;

-- Create a temporary function
create temporary function my_lower as "com.cenzhongman.hive.udf.LowerUDF";

-- List functions to confirm it was registered
show functions;

-- Use the function
select my_lower(job) lower_job from emp;

As of Hive 0.13, UDFs also have the option of being able to specify required jars in the CREATE FUNCTION statement:

Newer versions thus offer another way to register a function (the jar must be on HDFS):

CREATE FUNCTION myfunc AS 'myclass' USING JAR 'hdfs:///path/to/jar';
posted @ 2017-07-15 15:00  岑忠满