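Building my own big data operations platform (drag and drop custom operators to automatically build and run a Flink operator chain)
Written on June 9, 2023, day 40 of unemployment
With few interviews on the calendar, I decided to build a small project I find interesting: a big data operations platform. Time and energy were limited (interview prep takes most of both), so I only got a rough version working, but a prototype now exists and I want to record it here.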
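Preview of the intended result
The project is unfinished, so I'll just paste an image found online.
The finished product would look like this image: the list on the left holds the custom operators, which can be dragged and dropped into a new operator chain without writing any more operator code. Clicking Publish in the top-right corner then automatically builds the Flink operator chain and submits it to Flink for execution.
Implementation approach
Defining the platform API
Operators written directly against the stock open-source API cannot be used here, because they can only be submitted to Flink as they are. To support drag and drop on our own platform, we have to define our own set of rules and have every custom operator follow them.
Defining the operator base interfaces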
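// Builds a source operator. OUT is the emitted record type, C is the operator's config class.
public interface BaseSourceBuilder<OUT, C> {
    SingleOutputStreamOperator<OUT> build(StreamExecutionEnvironment env, C config);

    // Config class that the platform instantiates and passes to build()
    Class<?> getConfiguration();
}

// Builds a processing operator that turns a stream of IN into a stream of OUT.
public interface BaseProcessBuilder<IN, OUT, C> {
    SingleOutputStreamOperator<OUT> build(SingleOutputStreamOperator<IN> output, C config);

    Class<?> getConfiguration();
}

// Builds a sink operator. It returns the stream so the caller can keep working with it.
public interface BaseSinkBuilder<IN, C> {
    SingleOutputStreamOperator<IN> build(SingleOutputStreamOperator<IN> outputStreamOperator, C config);

    Class<?> getConfiguration();
}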
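To see how these builder contracts compose without pulling in Flink, here is a toy sketch in which a plain `List<String>` stands in for the stream. `ChainSketch`, `SourceStep`, `ProcessStep` and `assemble` are hypothetical names invented for this illustration; they are not part of the platform API above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class ChainSketch {
    /** Toy source step (hypothetical): produces the initial "stream", a plain list here. */
    interface SourceStep {
        List<String> build();
    }

    /** Toy processing step (hypothetical): maps one "stream" to the next. */
    interface ProcessStep extends UnaryOperator<List<String>> {
    }

    /** Applies the processing steps in the given order, as a published canvas might dictate. */
    static List<String> assemble(SourceStep source, List<ProcessStep> steps) {
        List<String> stream = source.build();
        for (ProcessStep step : steps) {
            stream = step.apply(stream);
        }
        return stream;
    }

    /** Small demo: two records, upper-cased and then suffixed. */
    static List<String> demo() {
        SourceStep source = () -> Arrays.asList("a", "b");
        ProcessStep upper = in -> {
            List<String> out = new ArrayList<>();
            for (String s : in) {
                out.add(s.toUpperCase());
            }
            return out;
        };
        ProcessStep suffix = in -> {
            List<String> out = new ArrayList<>();
            for (String s : in) {
                out.add(s + "-done");
            }
            return out;
        };
        return assemble(source, Arrays.asList(upper, suffix));
    }
}
```

The point of the shape is that each step only needs the output of the previous one, so the platform can wire any number of processing operators in whatever order the user arranged them.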
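There is also a custom annotation that carries each custom operator's descriptive metadata
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
public @interface ApiMessage {
    /**
     * Display name
     */
    String name() default "Unnamed";
    /**
     * Description
     */
    String desc() default "The developer was too lazy to write a description";
    /**
     * Icon file name
     */
    String icon();
}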
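As a rough illustration of how the platform side could turn this annotation into an entry for the operator list, the sketch below reads it reflectively. The annotation is re-declared in trimmed form so the snippet compiles on its own; `PaletteDemo`, `DemoOperator`, `PlainClass` and `describe` are hypothetical names used only here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class PaletteDemo {
    /** Trimmed copy of the annotation, repeated here only to keep the example self-contained. */
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @interface ApiMessage {
        String name() default "Unnamed";
        String desc() default "";
        String icon();
    }

    /** Hypothetical annotated operator class. */
    @ApiMessage(name = "Demo source", desc = "Emits demo data", icon = "Demo.png")
    static class DemoOperator {
    }

    /** Hypothetical class without the annotation. */
    static class PlainClass {
    }

    /** Returns "name | desc | icon" for an annotated class, or null when the annotation is absent. */
    static String describe(Class<?> type) {
        ApiMessage meta = type.getAnnotation(ApiMessage.class);
        if (meta == null) {
            return null;
        }
        return meta.name() + " | " + meta.desc() + " | " + meta.icon();
    }
}
```

`RetentionPolicy.RUNTIME` is what makes `getAnnotation` work here. The null check matters in practice: a class that implements a builder interface but forgets the annotation would otherwise cause a `NullPointerException` when its metadata is read.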
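With these in place we can write custom operators. Every Flink operator has to implement one of the base interfaces above.
A custom operator package is simply a jar containing a Flink streaming job written according to these rules.
Developing the custom operator package
Source operator (input)
Each operator is a set of three classes
- TestSource: the custom operator. It implements the base interface, whose type parameters are the output data type and the custom config class
- TestSourceFunction: the actual operator logic
- TestSourceConfig: the custom configuration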
@ApiMessage(
        name = "Test source operator",
        desc = "Test operator that emits one UUID per second to downstream",
        icon = "TestSource.png"
)
public class TestSource implements BaseSourceBuilder<String, TestSourceConfig> {
    public TestSource() {
    }

    @Override
    public SingleOutputStreamOperator<String> build(StreamExecutionEnvironment env, TestSourceConfig config) {
        // Read the configuration
        String name = config.getName();
        System.out.println("Config value: " + name);
        DataStreamSource<String> stringDataStreamSource = env.addSource(new TestSourceFunction());
        // Debug output: prints every record the source emits
        stringDataStreamSource.print();
        return stringDataStreamSource;
    }

    @Override
    public Class<?> getConfiguration() {
        return TestSourceConfig.class;
    }
}
public class TestSourceFunction implements SourceFunction<String> {
    // Set to false by cancel() so that run() can leave its loop
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> sourceContext) throws Exception {
        while (running) {
            sourceContext.collect(UUID.randomUUID().toString());
            Thread.sleep(1000);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}
public class TestSourceConfig extends BaseConfig {
    // Default value, used when the platform does not override it
    private String name = "test data";

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
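The post does not show how the platform would fill a config object with values entered by the user. One plausible approach, sketched here purely as an assumption, is to instantiate the class returned by `getConfiguration()` and call its setters reflectively. `ConfigBinder`, `DemoConfig` and `bind` are hypothetical names, and only `String` properties are handled.

```java
import java.lang.reflect.Method;
import java.util.Map;

class ConfigBinder {
    /** Hypothetical config bean with the same shape as the config classes in the post. */
    public static class DemoConfig {
        private String name = "test data";

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    /** Creates a config instance and applies each map entry through the matching setXxx(String) method. */
    static <T> T bind(Class<T> type, Map<String, String> values) {
        try {
            T config = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> entry : values.entrySet()) {
                String key = entry.getKey();
                // "name" -> "setName"
                String setter = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
                Method method = type.getMethod(setter, String.class);
                method.invoke(config, entry.getValue());
            }
            return config;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot bind config for " + type.getName(), e);
        }
    }
}
```

Keys without a matching setter fail fast with an exception, which seems preferable to silently ignoring a mistyped field name coming from a form.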
Processing operator
@ApiMessage(
        name = "Test processing operator",
        desc = "Test operator that appends a suffix to each upstream UUID and sends it downstream",
        icon = "TestProcess.png"
)
public class TestProcess implements BaseProcessBuilder<String, String, TestProcessConfig> {
    public TestProcess() {
    }

    @Override
    public SingleOutputStreamOperator build(SingleOutputStreamOperator output, TestProcessConfig config) {
        System.out.println("Processing operator config value: " + config.getTest());
        SingleOutputStreamOperator process = output.process(new TestProcessFunction());
        return process;
    }

    @Override
    public Class<?> getConfiguration() {
        return TestProcessConfig.class;
    }
}
public class TestProcessFunction extends ProcessFunction<String, String> {
    @Override
    public void processElement(String s, ProcessFunction<String, String>.Context context, Collector<String> collector) throws Exception {
        // Append a marker so the effect of this operator is visible in the output
        collector.collect(s + " (post-processed)");
    }
}
public class TestProcessConfig extends BaseConfig {
    private String test = "data carried by the test processing operator";

    public String getTest() {
        return test;
    }

    public void setTest(String test) {
        this.test = test;
    }
}
Sink operator (output)
@ApiMessage(
        name = "Test sink operator",
        desc = "Test operator that prints the result of the processing operator",
        icon = "TestSink.png"
)
public class TestSink implements BaseSinkBuilder<String, TestSinkConfig> {
    public TestSink() {
    }

    @Override
    public SingleOutputStreamOperator build(SingleOutputStreamOperator outputStreamOperator, TestSinkConfig config) {
        System.out.println("Sink operator config value: " + config.getName());
        // No sink is attached here; the caller prints the returned stream
        return outputStreamOperator;
    }

    @Override
    public Class<?> getConfiguration() {
        return TestSinkConfig.class;
    }
}
public class TestSinkConfig extends BaseConfig {
    private String name = "data carried by the test sink operator";

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
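After writing these three sets, create a META-INF/services folder under the resources directory. For each base interface, add a file named after the interface's fully qualified name (package plus interface name). Its content is the fully qualified name of the implementing class, one per line when there are several.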
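The file convention just described is the standard `java.util.ServiceLoader` provider-configuration mechanism. The sketch below reproduces it in miniature: it writes such a file into a temporary directory at run time (in a real operator package the file is simply part of the jar) and then discovers the implementation through `ServiceLoader`. `SpiDemo`, `Greeter` and `HelloGreeter` are hypothetical stand-ins for the builder interfaces and their implementations.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class SpiDemo {
    /** Hypothetical service interface, standing in for one of the builder interfaces. */
    public interface Greeter {
        String greet();
    }

    /** Hypothetical provider; ServiceLoader needs a public no-argument constructor. */
    public static class HelloGreeter implements Greeter {
        public HelloGreeter() {
        }

        @Override
        public String greet() {
            return "hello";
        }
    }

    /** Writes META-INF/services/<interface name> and lets ServiceLoader find the provider. */
    static List<String> discover() {
        try {
            // The temp directory plays the role of the jar root (left behind for simplicity)
            Path root = Files.createTempDirectory("spi-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // File name = interface name, file content = implementation name (one per line)
            Files.write(services.resolve(Greeter.class.getName()),
                    (HelloGreeter.class.getName() + "\n").getBytes(StandardCharsets.UTF_8));

            List<String> found = new ArrayList<>();
            try (URLClassLoader loader = new URLClassLoader(
                    new URL[]{root.toUri().toURL()}, SpiDemo.class.getClassLoader())) {
                for (Greeter greeter : ServiceLoader.load(Greeter.class, loader)) {
                    found.add(greeter.greet());
                }
            }
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the names in the file are binary class names, which is why the example uses `Class.getName()` rather than typing them by hand.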
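At this point the custom operator package is done; package it as a jar.
Developing the big data operations platform
Since this part is unfinished, I'll only outline it.
The implementation follows the process shown in the figure above,
which includes uploading the jar and parsing the custom jar with a class loader.
Loading a third-party jar with a class loader
Loading the jar this way adds it to the current project's classpath. Misusing class loaders leads to errors, so think this through carefully. It is a big pitfall!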
package com.example.ftx.utils;
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
/**
 * @author fanjiangfeng
 * @date 2023/6/8 16:37
 *
 * Class loader utility: loads an external jar file and returns the classes that implement a given interface
 */
public class ClassLoaderUtil {
    /**
     * Dynamically loading a jar with a class loader: https://blog.csdn.net/zsllxbb/article/details/49902661
     *
     * Adds the jar to the running JVM
     * @param url path of the jar file
     * @return the system class loader, with the jar appended to its search path
     */
    public static URLClassLoader getClassLoader(String url) {
        // URLClassLoader classLoader = new URLClassLoader(new URL[]{}, ClassLoader.getSystemClassLoader());
        // Note: this cast only works on Java 8 and earlier. From Java 9 on, the system
        // class loader is no longer a URLClassLoader and the cast throws ClassCastException.
        URLClassLoader classLoader = (URLClassLoader) ClassLoader.getSystemClassLoader();
        try {
            // addURL is protected, so it has to be invoked reflectively
            Method method = URLClassLoader.class.getDeclaredMethod("addURL", URL.class);
            if (!method.isAccessible()) {
                method.setAccessible(true);
            }
            method.invoke(classLoader, new URL("file:" + url));
            return classLoader;
        } catch (NoSuchMethodException | MalformedURLException | InvocationTargetException | IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }
    /**
     * Loads every class contained in the jar
     * @param url path of the jar file
     * @param classLoader class loader that can already see the jar
     */
    public static List<Class<?>> loadAllClass(String url, ClassLoader classLoader) {
        List<Class<?>> classList = new ArrayList<>();
        List<String> classNames = new ArrayList<>();
        // Collect class names from the jar entries, e.g. com/example/Foo.class -> com.example.Foo
        try (JarFile jarFile = new JarFile(url)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String entryName = entries.nextElement().getName();
                if (entryName.endsWith(".class")) {
                    classNames.add(entryName.substring(0, entryName.length() - ".class".length()).replace('/', '.'));
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        for (String className : classNames) {
            try {
                Class<?> theClass = classLoader.loadClass(className);
                classList.add(theClass);
                System.out.println(classLoader.getClass().getName() + " loaded external class: " + theClass.getName());
            } catch (ClassNotFoundException e) {
                throw new RuntimeException(e);
            }
        }
        return classList;
    }
    /**
     * Loads all classes in the jar that implement the given interface
     * @param url path of the jar file
     * @param classLoader class loader that can already see the jar
     * @param clazz the interface to look for
     * @return the implementing classes, or an empty list if clazz is not an interface
     */
    public static List<Class<?>> getClassListByInterface(String url, ClassLoader classLoader, Class<?> clazz) {
        List<Class<?>> classList = new ArrayList<>();
        if (!clazz.isInterface()) {
            return classList;
        }
        // Reuse loadAllClass, then keep only the classes that actually implement the interface
        // (the original version returned every class in the jar without filtering)
        for (Class<?> theClass : loadAllClass(url, classLoader)) {
            if (clazz.isAssignableFrom(theClass) && !theClass.isInterface()) {
                classList.add(theClass);
            }
        }
        return classList;
    }
}
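The class-loader pitfall mentioned before the utility class is easier to see in isolation. The sketch below shows the general visibility rule behind it: a loader can see what its parent chain can see, and nothing else. This is offered as background on parent delegation, not necessarily as the exact error the author ran into. `VisibilityDemo`, `visibleFrom` and `run` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;

class VisibilityDemo {
    /** Returns true if the given loader can resolve the class name. */
    static boolean visibleFrom(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Checks visibility of this application class from a child loader and from an isolated one. */
    static boolean[] run() {
        String name = VisibilityDemo.class.getName();
        try (URLClassLoader child = new URLClassLoader(new URL[0], VisibilityDemo.class.getClassLoader());
             URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            // child delegates to the application loader; isolated only has the bootstrap loader above it
            return new boolean[]{visibleFrom(child, name), visibleFrom(isolated, name)};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same rule applies in the other direction: classes loaded by a separate child loader are invisible to code that resolves names through the application loader. Appending the jar to the system class loader, as the utility above does, sidesteps that at the cost of putting every uploaded jar on one shared classpath.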
Test class
The test class only parses the jar, loads it with the class loader, then builds and runs a Flink streaming job
package com.example.ftx.test;
import com.example.ftx.utils.ClassLoaderUtil;
import com.example.jax.annotataion.ApiMessage;
import com.example.jax.api.BaseProcessBuilder;
import com.example.jax.api.BaseSinkBuilder;
import com.example.jax.api.BaseSourceBuilder;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import java.net.URLClassLoader;
import java.util.ServiceLoader;
/**
* @author fanjiangfeng
* @date 2023/6/8 16:34
*/
public class Test {
    public static void main(String[] args) throws Exception {
        BaseSourceBuilder sourceBuilder = null;
        BaseProcessBuilder processBuilder = null;
        BaseSinkBuilder sinkBuilder = null;
        String url = "V:\\大数据运维中台-自行研发\\jax-test-flink\\target\\jax-test-flink-0.0.1-SNAPSHOT.jar";
        // Load the jar into the JVM
        URLClassLoader classLoader = ClassLoaderUtil.getClassLoader(url);
        ClassLoaderUtil.loadAllClass(url, classLoader);
        // Discover all source operators via ServiceLoader and keep the last one found
        ServiceLoader<BaseSourceBuilder> sourceBuilders = ServiceLoader.load(BaseSourceBuilder.class, classLoader);
        for (BaseSourceBuilder loader : sourceBuilders) {
            System.out.println(loader.getClass().getName());
            Class<? extends BaseSourceBuilder> aClass = loader.getClass();
            ApiMessage annotation = aClass.getAnnotation(ApiMessage.class);
            String desc = annotation.desc();
            System.out.println(desc);
            sourceBuilder = loader;
        }
        // Discover all processing operators and keep the last one found
        ServiceLoader<BaseProcessBuilder> processBuilders = ServiceLoader.load(BaseProcessBuilder.class, classLoader);
        for (BaseProcessBuilder loader : processBuilders) {
            System.out.println(loader.getClass().getName());
            Class<? extends BaseProcessBuilder> aClass = loader.getClass();
            ApiMessage annotation = aClass.getAnnotation(ApiMessage.class);
            String desc = annotation.desc();
            System.out.println(desc);
            processBuilder = loader;
        }
        // Discover all sink operators and keep the last one found
        ServiceLoader<BaseSinkBuilder> sinkBuilders = ServiceLoader.load(BaseSinkBuilder.class, classLoader);
        for (BaseSinkBuilder loader : sinkBuilders) {
            System.out.println(loader.getClass().getName());
            Class<? extends BaseSinkBuilder> aClass = loader.getClass();
            ApiMessage annotation = aClass.getAnnotation(ApiMessage.class);
            String desc = annotation.desc();
            System.out.println(desc);
            sinkBuilder = loader;
        }
        // Fail early with a clear message if the jar did not provide all three operator kinds
        if (sourceBuilder == null || processBuilder == null || sinkBuilder == null) {
            throw new IllegalStateException("The jar must provide a source, a processing and a sink operator");
        }
        // Build the Flink job
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Add the source operator, using a config instance with default values
        SingleOutputStreamOperator build = sourceBuilder.build(env, sourceBuilder.getConfiguration().getDeclaredConstructor().newInstance());
        // Add the processing operator
        SingleOutputStreamOperator build1 = processBuilder.build(build, processBuilder.getConfiguration().getDeclaredConstructor().newInstance());
        // Add the sink operator
        SingleOutputStreamOperator build2 = sinkBuilder.build(build1, sinkBuilder.getConfiguration().getDeclaredConstructor().newInstance());
        build2.print();
        env.execute();
    }
}
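The test results are shown below
I don't have the energy to finish the rest, but I'm confident that following the approach in the figure above will work!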
Good Bye!!