002. Hive UDF (user-defined functions)
IDEA
Configuration file pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.star</groupId>
    <artifactId>scala_demo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>2.1.0</version>
        </dependency>
        <!-- hive-exec provides org.apache.hadoop.hive.ql.exec.UDF -->
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>2.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-simple</artifactId>
            <version>1.7.25</version>
            <scope>compile</scope>
        </dependency>
        <!-- junit was declared twice (4.11 and 4.12); only one declaration is kept -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
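Java code. Note that the method must be named evaluate: Hive looks up the methods of a UDF subclass by this name, and the class may overload it with different parameter types.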
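import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;

// The text shown by "desc function extended myudf" comes from this annotation.
@Description(
        name = "myudf",
        value = "this is a function",
        extended = "eg:select myudf(1,2) = > 1+2,select myudf(\"hello\",\"world\")=>helloworld"
)
public class MyUDF extends UDF {

    // Called for two int arguments: returns their sum.
    public int evaluate(int i, int j) {
        return i + j;
    }

    // Called for two string arguments: returns their concatenation.
    public String evaluate(String a, String b) {
        return a + b;
    }
}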
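To illustrate why the name evaluate matters, here is a minimal sketch of a name-based reflective lookup in plain Java. It is only an illustration under stated assumptions, not Hive's actual resolver: the class MyUdfLike and the helper callEvaluate are hypothetical names introduced here, and MyUdfLike deliberately does not extend Hive's UDF so the sketch runs without Hive on the classpath.

```java
import java.lang.reflect.Method;

public class EvaluateLookupSketch {

    // Hypothetical stand-in for MyUDF with the same two overloads.
    public static class MyUdfLike {
        public int evaluate(int i, int j) {
            return i + j;
        }

        public String evaluate(String a, String b) {
            return a + b;
        }
    }

    // Hypothetical helper: finds the public method named "evaluate" whose
    // parameter types match exactly, then invokes it. A method with any other
    // name would never be found by this lookup.
    public static Object callEvaluate(Object udf, Class<?>[] paramTypes, Object... args) {
        try {
            Method m = udf.getClass().getMethod("evaluate", paramTypes);
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no matching evaluate method", e);
        }
    }

    public static void main(String[] args) {
        MyUdfLike udf = new MyUdfLike();
        // Selects evaluate(int, int).
        System.out.println(callEvaluate(udf, new Class<?>[] {int.class, int.class}, 1, 2));
        // Selects evaluate(String, String).
        System.out.println(callEvaluate(udf, new Class<?>[] {String.class, String.class}, "hello", "world"));
    }
}
```

The overload is chosen from the argument types alone, which is why one class can serve both the integer and the string call shown in the function description.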
Build the jar
Upload the jar to /soft/apache-hive-2.1.1-bin/lib so that it is on Hive's classpath
Restart the Hive service
[centos@s101 /home/centos]$hiveserver2
Connect to Hive
[centos@s101 /home/centos]$beeline -u jdbc:hive2://localhost:10000
Register the function
0: jdbc:hive2://localhost:10000> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| mydb |
+----------------+--+
2 rows selected (4.71 seconds)
0: jdbc:hive2://localhost:10000> use mydb;
0: jdbc:hive2://localhost:10000> create function myudf as 'MyUDF';
No rows affected (0.839 seconds)
0: jdbc:hive2://localhost:10000> show functions;
0: jdbc:hive2://localhost:10000> use mydb
. . . . . . . . . . . . . . . .> ;
No rows affected (14.597 seconds)
0: jdbc:hive2://localhost:10000> create function myudf as 'MyUDF';
No rows affected (0.464 seconds)
0: jdbc:hive2://localhost:10000> desc function extended myudf;
+-------------------------------------------------------------------------+--+
| tab_name |
+-------------------------------------------------------------------------+--+
| this is a function |
| Synonyms: mydb.myudf |
| eg:select myudf(1,2) = > 1+2,select myudf("hello","world")=>helloworld |
+-------------------------------------------------------------------------+--+
3 rows selected (4.17 seconds)
0: jdbc:hive2://localhost:10000> select mydb.myudf("df","df");
+-------+--+
| _c0 |
+-------+--+
| dfdf |
+-------+--+
1 row selected (4.564 seconds)
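The same query can also be issued from Java through the hive-jdbc dependency declared in pom.xml. The following is a rough sketch, assuming HiveServer2 is running at the URL used in the beeline command above, that the hive-jdbc driver is on the runtime classpath and registers itself with DriverManager, and that no user name or password is required. The class name MyUdfJdbcClient and the method callMyUdf are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class MyUdfJdbcClient {

    // Same URL as in the beeline command above.
    public static final String JDBC_URL = "jdbc:hive2://localhost:10000";

    // Same query as in the beeline session above.
    public static final String QUERY = "select mydb.myudf(\"df\",\"df\")";

    // Runs the query and returns the first column of the first row,
    // or null if the result set is empty.
    public static String callMyUdf() throws SQLException {
        try (Connection conn = DriverManager.getConnection(JDBC_URL);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(QUERY)) {
            return rs.next() ? rs.getString(1) : null;
        }
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(callMyUdf());
    }
}
```

The function is referenced by its qualified name mydb.myudf, so the call does not depend on which database the connection is currently using.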