Building my own big data operations platform (drag and drop custom operators to automatically build and run a Flink operator chain)

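Written on June 9, 2023, day 40 of unemployment.
With few interviews lined up, I decided to build a small project, a big data operations platform, because the idea interests me. Time and energy are limited (interview prep takes most of both), so I only built a rough version, but a working prototype has taken shape and I want to record it here.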

Preview of the intended result

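The project is unfinished, so here is an image from the web instead.
[Image]
The finished product would look like this image. The list on the left holds the custom operators, which can be dragged and dropped to form a new operator chain without writing any more operator code. Clicking Publish in the top-right corner then builds the Flink operator chain automatically and submits it to Flink for execution.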

Implementation approach

[Image]

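Defining the platform API

Operators written the usual open-source way cannot be used here, because they can only be submitted directly to Flink. To support drag and drop on our own platform, we have to define our own set of rules that every custom operator must follow.

Defining the operator base interfaces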

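import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// OUT: type of the emitted records, C: operator config class
public interface BaseSourceBuilder<OUT,C> {

    // Creates the source on the given environment
    SingleOutputStreamOperator build(StreamExecutionEnvironment env, C config);

    // Config class that the platform instantiates and passes to build()
    Class<?> getConfiguration();

}

// IN/OUT: input and output record types, C: operator config class
public interface BaseProcessBuilder<IN,OUT,C> {

    // Appends the processing step to the upstream operator
    SingleOutputStreamOperator build(SingleOutputStreamOperator output, C config);

    Class<?> getConfiguration();
}

// IN: input record type, C: operator config class
public interface BaseSinkBuilder<IN,C> {

    // Appends the sink step to the upstream operator
    SingleOutputStreamOperator build(SingleOutputStreamOperator outputStreamOperator, C config);

    Class<?> getConfiguration();
}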

There is also a custom annotation that carries descriptive metadata about a custom operator.

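import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
public @interface ApiMessage {

    /**
     * Display name
     */
    String name() default "Unnamed";
    /**
     * Description
     */
    String desc() default "The developer was too lazy to write a description";
    /**
     * Icon file name
     */
    String icon();

}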

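With these definitions in place, we can write custom operators. Every Flink operator must implement one of the base interfaces above.
A custom operator package is simply a jar of Flink stream-processing code written according to these rules.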
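
Because the annotation is retained at runtime, the platform can read it by reflection to label each operator in the drag-and-drop list. The following is a minimal sketch of that lookup, not the project's actual code: ApiMessageDemo, DemoSource, Unlabeled, and describe are hypothetical names, and the annotation is a stand-in copy so the example compiles on its own.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ApiMessageDemo {

    // Stand-in copy of the article's annotation so the sketch is self-contained.
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    public @interface ApiMessage {
        String name() default "Unnamed";
        String desc() default "No description";
        String icon();
    }

    // Hypothetical operator class; only the annotation matters here.
    @ApiMessage(name = "Demo source", desc = "emits one UUID per second", icon = "DemoSource.png")
    public static class DemoSource {
    }

    // Hypothetical class without the annotation, to exercise the null check.
    public static class Unlabeled {
    }

    /** Builds a one-line label for the operator list, or returns null if the class is not annotated. */
    public static String describe(Class<?> operatorClass) {
        ApiMessage message = operatorClass.getAnnotation(ApiMessage.class);
        if (message == null) {
            return null;
        }
        return message.name() + ": " + message.desc() + " [" + message.icon() + "]";
    }

    public static void main(String[] args) {
        System.out.println(describe(DemoSource.class));
        System.out.println(describe(Unlabeled.class));
    }
}
```

Checking for a missing annotation matters because getAnnotation returns null for classes that were never labeled, which would otherwise surface as a NullPointerException when the platform lists operators.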

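Developing the custom operator package

Source operator

Each operator is a set of three classes:

  • TestSource: the custom operator. It implements the base interface, whose type parameters are the output data type and the custom config class.
  • TestSourceFunction: the actual operator logic.
  • TestSourceConfig: the custom configuration.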
@ApiMessage(
        name = "Test source operator",
        desc = "Test operator that emits one UUID per second to downstream",
        icon = "TestSource.png"
)
public class TestSource implements BaseSourceBuilder<String,TestSourceConfig> {
    public TestSource() {
    }

    @Override
    public SingleOutputStreamOperator build(StreamExecutionEnvironment env, TestSourceConfig config) {
        // Read the configuration
        String name = config.getName();
        System.out.println("Config content: " + name);
        DataStreamSource<String> stringDataStreamSource = env.addSource(new TestSourceFunction());
        stringDataStreamSource.print();
        return stringDataStreamSource;
    }

    @Override
    public Class<?> getConfiguration() {
        return TestSourceConfig.class;
    }
}
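
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import java.util.UUID;

public class TestSourceFunction implements SourceFunction<String> {
    // Cleared by cancel() so that run() can leave its loop
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> sourceContext) throws Exception {
        while (running) {
            sourceContext.collect(UUID.randomUUID().toString());
            Thread.sleep(1000);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}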

public class TestSourceConfig extends BaseConfig {
    private String name = "test data";

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
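
The config classes are plain beans with defaults, so the platform could create one reflectively and then override the defaults with whatever the user typed into the operator's form. The sketch below shows one way to do that with the standard java.beans introspection API. It is an assumption about how the unfinished platform might work: ConfigBinderDemo, DemoConfig, and bind are hypothetical names, and BaseConfig is left out because its contents are not shown in this post.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Collections;
import java.util.Map;

public class ConfigBinderDemo {

    // Stand-in for a config class such as TestSourceConfig.
    public static class DemoConfig {
        private String name = "test data";

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    /** Creates a config object and overrides its defaults with values keyed by property name. */
    public static <C> C bind(Class<C> configClass, Map<String, Object> values) {
        try {
            C config = configClass.getDeclaredConstructor().newInstance();
            for (PropertyDescriptor property : Introspector.getBeanInfo(configClass).getPropertyDescriptors()) {
                Object value = values.get(property.getName());
                // Skip properties without a supplied value or without a setter (for example "class").
                if (value != null && property.getWriteMethod() != null) {
                    property.getWriteMethod().invoke(config, value);
                }
            }
            return config;
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalStateException("Cannot bind config " + configClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> noOverrides = Collections.emptyMap();
        Map<String, Object> overrides = Collections.<String, Object>singletonMap("name", "orders");
        System.out.println(bind(DemoConfig.class, noOverrides).getName());
        System.out.println(bind(DemoConfig.class, overrides).getName());
    }
}
```

This sketch does no type conversion, so a value whose type does not match the setter parameter fails with an exception instead of being silently coerced.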

Processing operator

@ApiMessage(
        name = "Test processing operator",
        desc = "Test operator that appends a suffix to each upstream UUID and sends it downstream",
        icon = "TestProcess.png"
)
public class TestProcess implements BaseProcessBuilder<String,String,TestProcessConfig> {

    public TestProcess() {
    }

    @Override
    public SingleOutputStreamOperator build(SingleOutputStreamOperator output, TestProcessConfig config) {
        System.out.println("Processing operator config value: " + config.getTest());
        SingleOutputStreamOperator process = output.process(new TestProcessFunction());
        return process;
    }

    @Override
    public Class<?> getConfiguration() {
        return TestProcessConfig.class;
    }

}

public class TestProcessFunction extends ProcessFunction<String, String> {
    @Override
    public void processElement(String s, ProcessFunction<String, String>.Context context, Collector<String> collector) throws Exception {
        collector.collect(s + " post-processed");
    }
}

public class TestProcessConfig extends BaseConfig {
    private String test = "config data carried by the test processing operator";

    public String getTest() {
        return test;
    }

    public void setTest(String test) {
        this.test = test;
    }
}

Sink operator

@ApiMessage(
        name = "Test sink operator",
        desc = "Test operator that prints the result of the processing operator",
        icon = "TestSink.png"
)
public class TestSink implements BaseSinkBuilder<String,TestSinkConfig> {
    public TestSink() {
    }

    @Override
    public SingleOutputStreamOperator build(SingleOutputStreamOperator outputStreamOperator, TestSinkConfig config) {
        System.out.println("Sink operator config value: " + config.getName());
        // The stream is returned unchanged; the caller attaches print() to it
        return outputStreamOperator;
    }

    @Override
    public Class<?> getConfiguration() {
        return TestSinkConfig.class;
    }
}

public class TestSinkConfig extends BaseConfig {
    private String name = "config data carried by the test sink operator";

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

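After writing these three sets, create a META-INF/services folder under the resources directory. Name each file after the fully qualified name of an interface (package plus interface name). Its content is the fully qualified name of the implementation class, with one class per line when there are several.
[Image]
[Image]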
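
To make the file convention concrete, the sketch below writes such a provider file into a temporary directory and lets java.util.ServiceLoader discover the implementation through it. It is only an illustration with stand-in types: ServiceFileDemo, DemoBuilder, DemoSource, and discover are hypothetical names, and a real operator jar would ship the file under META-INF/services instead of generating it at runtime.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceFileDemo {

    // Hypothetical stand-in for an operator interface such as BaseSourceBuilder.
    public interface DemoBuilder {
        String label();
    }

    // Hypothetical implementation; ServiceLoader needs a public no-argument constructor.
    public static class DemoSource implements DemoBuilder {
        public DemoSource() {
        }

        @Override
        public String label() {
            return "demo-source";
        }
    }

    /** Writes META-INF/services/<interface name> into a temp directory and loads the providers from it. */
    public static List<String> discover() {
        try {
            Path root = Files.createTempDirectory("services-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // File name = fully qualified interface name; each line = one implementation class name.
            Files.write(services.resolve(DemoBuilder.class.getName()),
                    Collections.singletonList(DemoSource.class.getName()), StandardCharsets.UTF_8);

            List<String> labels = new ArrayList<>();
            try (URLClassLoader loader = new URLClassLoader(
                    new URL[]{root.toUri().toURL()}, ServiceFileDemo.class.getClassLoader())) {
                for (DemoBuilder builder : ServiceLoader.load(DemoBuilder.class, loader)) {
                    labels.add(builder.label());
                }
            }
            return labels;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discover());
    }
}
```

The class names are taken from getName() rather than typed by hand because nested classes use a binary name containing `$`, and a mismatch in the provider file makes ServiceLoader fail with a ServiceConfigurationError.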

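At this point the custom operator package is complete; package it as a jar.

Developing the big data operations platform

This part is unfinished, so I will only outline it.
[Image]

The implementation follows the process in the figure above.
The key steps are uploading the jar and having a class loader parse the custom jar.

Loading a third-party jar with a class loader

Loading a third-party jar this way adds it to the current project's classpath. Misusing class loaders leads to errors, so think this through carefully. It is a major pitfall!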

package com.example.ftx.utils;

import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/**
 * @author fanjiangfeng
 * @date 2023/6/8 16:37
 *
 * Class loader utility: loads an external jar file and returns the classes that implement a given interface
 */
public class ClassLoaderUtil {

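    /**
     * Dynamically loading a jar with a class loader: https://blog.csdn.net/zsllxbb/article/details/49902661
     *
     * Adds the jar to the system class loader of the running JVM.
     * Note: the cast below only works on Java 8 and earlier; since Java 9 the
     * system class loader is no longer a URLClassLoader.
     * @param url path of the jar file
     * @return the system class loader, which can now see the jar
     */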
    public static URLClassLoader getClassLoader(String url) {
//        URLClassLoader classLoader = new URLClassLoader(new URL[]{}, ClassLoader.getSystemClassLoader());
        URLClassLoader classLoader= (URLClassLoader) ClassLoader.getSystemClassLoader();
        try {
            Method method = URLClassLoader.class.getDeclaredMethod("addURL", new Class[]{URL.class});
            if (!method.isAccessible()) {
                method.setAccessible(true);
            }
            method.invoke(classLoader, new URL("file:" + url));
            return classLoader;
        } catch (NoSuchMethodException | MalformedURLException | InvocationTargetException | IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Loads every class in the jar
     * @param url path of the jar file
     * @param classLoader class loader that can see the jar
     */
    public static List<Class<?>> loadAllClass(String url, ClassLoader classLoader){
        List<Class<?>> classList = new ArrayList<>();
        List<String> classNames = new ArrayList<>();
        try (JarFile jarFile = new JarFile(url)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry jarEntry = entries.nextElement();
                String entryName = jarEntry.getName();
                if (entryName != null && entryName.endsWith(".class")) {
                    entryName = entryName.replace("/", ".").substring(0, entryName.lastIndexOf("."));
                    classNames.add(entryName);
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

        if (classNames.size() > 0) {
            for (String className : classNames) {
                try {
                    Class<?> theClass = classLoader.loadClass(className);
                    classList.add(theClass);
                    System.out.println(classLoader.getClass().getName() + " loaded class from external jar: " + theClass.getName());
                } catch (ClassNotFoundException e) {
                    throw new RuntimeException(e);
                }
            }
        }
        return classList;
    }

    /**
     * Loads all implementations of the given interface from the jar
     * @param url path of the jar file
     * @param classLoader class loader that can see the jar
     * @param clazz interface to look for
     * @return implementing classes; empty if clazz is not an interface
     */
    public static List<Class<?>> getClassListByInterface(String url, ClassLoader classLoader, Class<?> clazz) {
        List<Class<?>> classList = new ArrayList<>();
        if (!clazz.isInterface()) {
            return classList;
        }
        // Reuse loadAllClass, then keep only concrete implementations of the interface
        for (Class<?> theClass : loadAllClass(url, classLoader)) {
            if (clazz.isAssignableFrom(theClass) && !theClass.isInterface()) {
                classList.add(theClass);
            }
        }
        return classList;
    }
}
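
The utility above casts the system class loader to URLClassLoader, which throws a ClassCastException on Java 9 and later. A common alternative is to give each uploaded jar its own child URLClassLoader. The sketch below shows only that construction: JarLoaderDemo, openJarLoader, and the jar path are hypothetical, and whether Flink can then resolve the operator classes at runtime depends on how the job is submitted, which this post does not cover.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JarLoaderDemo {

    /** Creates a dedicated child class loader for one jar, leaving the system class loader untouched. */
    public static URLClassLoader openJarLoader(Path jar, ClassLoader parent) {
        try {
            // Path#toUri copes with Windows drive letters and non-ASCII characters in the path.
            URL jarUrl = jar.toAbsolutePath().toUri().toURL();
            return new URLClassLoader(new URL[]{jarUrl}, parent);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable jar path: " + jar, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical jar path, used only for illustration; the file does not need to exist here.
        Path jar = Paths.get("operators", "demo-operators.jar");
        try (URLClassLoader loader = openJarLoader(jar, JarLoaderDemo.class.getClassLoader())) {
            System.out.println(loader.getURLs()[0]);
            // Classes the parent already knows are still resolved through parent delegation.
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

Passing the platform's own class loader as the parent keeps shared interfaces such as the builder API identical on both sides, while closing the child loader releases the jar when an operator package is replaced.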

Test class

The test class simply parses the jar, loads it with the class loader, builds the Flink streaming job, and runs it.

package com.example.ftx.test;

import com.example.ftx.utils.ClassLoaderUtil;
import com.example.jax.annotataion.ApiMessage;
import com.example.jax.api.BaseProcessBuilder;
import com.example.jax.api.BaseSinkBuilder;
import com.example.jax.api.BaseSourceBuilder;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.net.URLClassLoader;
import java.util.ServiceLoader;

/**
 * @author fanjiangfeng
 * @date 2023/6/8 16:34
 */
public class Test {


    public static void main(String[] args) throws Exception {
        BaseSourceBuilder sourceBuilder = null;
        BaseProcessBuilder processBuilder = null;
        BaseSinkBuilder sinkBuilder = null;

        String url = "V:\\大数据运维中台-自行研发\\jax-test-flink\\target\\jax-test-flink-0.0.1-SNAPSHOT.jar";

        //Load the jar into the JVM
        URLClassLoader classLoader = ClassLoaderUtil.getClassLoader(url);
        ClassLoaderUtil.loadAllClass(url,classLoader);
        //Load all source operators and keep them
        ServiceLoader<BaseSourceBuilder> sourceBuilders = ServiceLoader.load(BaseSourceBuilder.class,classLoader);
        for(BaseSourceBuilder loader:sourceBuilders){
            System.out.println(loader.getClass().getName());
            Class<? extends BaseSourceBuilder> aClass = loader.getClass();
            ApiMessage annotation = aClass.getAnnotation(ApiMessage.class);
            String desc = annotation.desc();
            System.out.println(desc);

            sourceBuilder = loader;
        }

        //Load all processing operators and keep them
        ServiceLoader<BaseProcessBuilder> processBuilders = ServiceLoader.load(BaseProcessBuilder.class,classLoader);
        for(BaseProcessBuilder loader:processBuilders){
            System.out.println(loader.getClass().getName());
            Class<? extends BaseProcessBuilder> aClass = loader.getClass();
            ApiMessage annotation = aClass.getAnnotation(ApiMessage.class);
            String desc = annotation.desc();
            System.out.println(desc);

            processBuilder = loader;
        }

        //Load all sink operators and keep them
        ServiceLoader<BaseSinkBuilder> sinkBuilders = ServiceLoader.load(BaseSinkBuilder.class,classLoader);
        for(BaseSinkBuilder loader:sinkBuilders){
            System.out.println(loader.getClass().getName());
            Class<? extends BaseSinkBuilder> aClass = loader.getClass();
            ApiMessage annotation = aClass.getAnnotation(ApiMessage.class);
            String desc = annotation.desc();
            System.out.println(desc);

            sinkBuilder = loader;
        }

        //Build the Flink job
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        //Add the source operator
        SingleOutputStreamOperator build = sourceBuilder.build(env, sourceBuilder.getConfiguration().getDeclaredConstructor().newInstance());
        //Add the processing operator
        SingleOutputStreamOperator build1 = processBuilder.build(build, processBuilder.getConfiguration().getDeclaredConstructor().newInstance());
        //Add the sink operator
        SingleOutputStreamOperator build2 = sinkBuilder.build(build1, sinkBuilder.getConfiguration().getDeclaredConstructor().newInstance());

        build2.print();
        env.execute();

    }
}
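
The test wires exactly one source, one processing operator, and one sink by hand. A published drag-and-drop chain would presumably arrive as an ordered list of stages instead, so the same wiring becomes a loop. The sketch below shows that loop with plain Java stand-ins and no Flink dependency: ChainDemo, buildChain, demo, and the use of List<String> in place of a stream are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

public class ChainDemo {

    /** Feeds the source output through each stage in order, mirroring source -> process -> sink. */
    public static <T> T buildChain(Supplier<T> source, List<UnaryOperator<T>> stages) {
        T stream = source.get();
        for (UnaryOperator<T> stage : stages) {
            stream = stage.apply(stream);
        }
        return stream;
    }

    /** Runs a two-stage chain over a fixed list that stands in for a data stream. */
    public static List<String> demo() {
        Supplier<List<String>> source = () -> Arrays.asList("id-1", "id-2");
        // Stand-in for a processing operator: appends a suffix to every element.
        UnaryOperator<List<String>> appendSuffix = items -> {
            List<String> out = new ArrayList<>();
            for (String item : items) {
                out.add(item + "-processed");
            }
            return out;
        };
        // Stand-in for the sink, which passes the stream through unchanged like TestSink does.
        UnaryOperator<List<String>> passThroughSink = items -> items;
        return buildChain(source, Arrays.asList(appendSuffix, passThroughSink));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the real platform each stage would wrap a call to a builder's build method with its config object, and the list order would come from the graph drawn in the UI.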

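The test result is shown below.
[Image]
I have no energy left to build the rest, but I am 100% sure that following the approach in the figure above will work!
Goodbye!!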

posted @ 2023-06-09 18:56  你樊不樊