Flink Stream-Merging Operations: CoProcessFunction

Introduction to CoProcessFunction

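Processing a ConnectedStreams requires a separate transformation for each of the two input streams. The corresponding interfaces therefore declare two parallel methods, distinguished by the suffixes "1" and "2", and each method is called when an element arrives on its own stream. Interfaces of this kind are called co-process functions. The pattern is the same across operators: calling .map() takes a CoMapFunction with map1() and map2(), calling .flatMap() takes a CoFlatMapFunction with flatMap1() and flatMap2(), and calling .process() takes a CoProcessFunction. The abstract class CoProcessFunction is defined in the Flink source as follows: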

public abstract class CoProcessFunction<IN1, IN2, OUT> extends AbstractRichFunction {

    private static final long serialVersionUID = 1L;

    /**
     * This method is called for each element in the first of the connected streams.
     * <p>This function can output zero or more elements using the {@link Collector} parameter and
     * also update internal state or set timers using the {@link Context} parameter.
     * @param value The stream element
     * @param ctx A {@link Context} that allows querying the timestamp of the element, querying the
     *     {@link TimeDomain} of the firing timer and getting a {@link TimerService} for registering
     *     timers and querying the time. The context is only valid during the invocation of this
     *     method, do not store it.
     * @param out The collector to emit resulting elements to
     * @throws Exception The function may throw exceptions which cause the streaming program to fail
     *     and go into recovery.
     */
    public abstract void processElement1(IN1 value, Context ctx, Collector<OUT> out)
            throws Exception;

    /**
     * This method is called for each element in the second of the connected streams.
     * <p>This function can output zero or more elements using the {@link Collector} parameter and
     * also update internal state or set timers using the {@link Context} parameter.
     * @param value The stream element
     * @param ctx A {@link Context} that allows querying the timestamp of the element, querying the
     *     {@link TimeDomain} of the firing timer and getting a {@link TimerService} for registering
     *     timers and querying the time. The context is only valid during the invocation of this
     *     method, do not store it.
     * @param out The collector to emit resulting elements to
     * @throws Exception The function may throw exceptions which cause the streaming program to fail
     *     and go into recovery.
     */
    public abstract void processElement2(IN2 value, Context ctx, Collector<OUT> out)
            throws Exception;

    /**
     * Called when a timer set using {@link TimerService} fires.
     * @param timestamp The timestamp of the firing timer.
     * @param ctx An {@link OnTimerContext} that allows querying the timestamp of the firing timer,
     *     querying the {@link TimeDomain} of the firing timer and getting a {@link TimerService}
     *     for registering timers and querying the time. The context is only valid during the
     *     invocation of this method, do not store it.
     * @param out The collector for returning result values.
     * @throws Exception This method may throw exceptions. Throwing an exception will cause the
     *     operation to fail and may trigger recovery.
     */
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<OUT> out) throws Exception {}

    /**
     * Information available in an invocation of {@link #processElement1(Object, Context,
     * Collector)}/ {@link #processElement2(Object, Context, Collector)} or {@link #onTimer(long,
     * OnTimerContext, Collector)}.
     */
    public abstract class Context {

        /**
         * Timestamp of the element currently being processed or timestamp of a firing timer.
         * <p>This might be {@code null}, for example if the time characteristic of your program is
         * set to {@link org.apache.flink.streaming.api.TimeCharacteristic#ProcessingTime}.
         */
        public abstract Long timestamp();

        /** A {@link TimerService} for querying time and registering timers. */
        public abstract TimerService timerService();

        /**
         * Emits a record to the side output identified by the {@link OutputTag}.
         * @param outputTag the {@code OutputTag} that identifies the side output to emit to.
         * @param value The record to emit.
         */
        public abstract <X> void output(OutputTag<X> outputTag, X value);
    }

    /** Information available in an invocation of {@link #onTimer(long, OnTimerContext, Collector)}. */
    public abstract class OnTimerContext extends Context {
        /** The {@link TimeDomain} of the firing timer. */
        public abstract TimeDomain timeDomain();
    }
}

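To make the "one callback per input" idea concrete, here is a minimal plain-Java sketch that does not use Flink at all. The names TwoInputMapper, CoMapSketch and connectAndMap are hypothetical and exist only for this illustration; the sketch simply alternates between two lists to imitate interleaved arrival, whereas in a real job the arrival order across the two streams is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a co-function: one callback per input. Not a Flink interface. */
interface TwoInputMapper<IN1, IN2, OUT> {
    OUT map1(IN1 value); // called for elements of the first input
    OUT map2(IN2 value); // called for elements of the second input
}

class CoMapSketch {

    /** Feeds both inputs through one function object, alternating between them. */
    static <A, B, R> List<R> connectAndMap(List<A> first, List<B> second, TwoInputMapper<A, B, R> fn) {
        List<R> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < first.size() || j < second.size()) {
            if (i < first.size()) {
                out.add(fn.map1(first.get(i++)));
            }
            if (j < second.size()) {
                out.add(fn.map2(second.get(j++)));
            }
        }
        return out;
    }

    static List<String> demo() {
        TwoInputMapper<Integer, String, String> fn = new TwoInputMapper<Integer, String, String>() {
            @Override
            public String map1(Integer value) {
                return "number: " + value;
            }

            @Override
            public String map2(String value) {
                return "string: " + value;
            }
        };
        return connectAndMap(Arrays.asList(1, 2), Arrays.asList("a", "b", "c"), fn);
    }

    public static void main(String[] args) {
        // prints [number: 1, string: a, number: 2, string: b, string: c]
        System.out.println(demo());
    }
}
```

The two inputs may have different element types, but both callbacks must produce the same output type, which is the same constraint that the IN1, IN2 and OUT type parameters of CoProcessFunction express.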
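import org.apache.flink.api.common.eventtime.SerializableTimestampAssigner;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.api.java.tuple.Tuple4;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

import java.time.Duration;

/**
 * Real-time reconciliation demo.
 */
public class BillCheckExample0828 {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // 1.1 Use parallelism 1 for easy testing; in production, set it to the number of Kafka topic partitions
        env.setParallelism(1);

        // 2. Read the data and declare watermarks
        // 2.1 Simulated data from the app: appStream
        SingleOutputStreamOperator<Tuple3<String, String, Long>> appStream = env.fromElements(
                Tuple3.of("order-1", "app", 1000L),
                Tuple3.of("order-2", "app", 2000L),
                Tuple3.of("order-3", "app", 3500L)
        ).assignTimestampsAndWatermarks(WatermarkStrategy.<Tuple3<String, String, Long>>forBoundedOutOfOrderness(Duration.ZERO)
                .withTimestampAssigner(new SerializableTimestampAssigner<Tuple3<String, String, Long>>() {
                    @Override
                    public long extractTimestamp(Tuple3<String, String, Long> element, long recordTimestamp) {
                        return element.f2;
                    }
                })
        );

        // 2.2 Simulated data from the third-party payment platform: thirdPartStream
        SingleOutputStreamOperator<Tuple4<String, String, String, Long>> thirdPartStream = env.fromElements(
                Tuple4.of("order-1", "third-party", "success", 3000L),
                Tuple4.of("order-3", "third-party", "success", 4000L)
        ).assignTimestampsAndWatermarks(WatermarkStrategy.<Tuple4<String, String, String, Long>>forBoundedOutOfOrderness(Duration.ZERO)
                .withTimestampAssigner(new SerializableTimestampAssigner<Tuple4<String, String, String, Long>>() {
                    @Override
                    public long extractTimestamp(Tuple4<String, String, String, Long> element, long recordTimestamp) {
                        return element.f3;
                    }
                })
        );

        // 3. Connect the streams, key both by order id, and check whether each order appears in both streams
        appStream.connect(thirdPartStream).keyBy(data -> data.f0, data -> data.f0)
                .process(new OrderMatchResult0828())
                .print();

        env.execute();
    }

    /**
     * Custom CoProcessFunction implementation.
     */
    public static class OrderMatchResult0828 extends CoProcessFunction<Tuple3<String, String, Long>, Tuple4<String, String, String, Long>, String> {

        // Keyed state: the unmatched event from each side for the current order id
        private ValueState<Tuple3<String, String, Long>> appEventState;
        private ValueState<Tuple4<String, String, String, Long>> thirdPartyEventState;

        @Override
        public void open(Configuration parameters) throws Exception {
            appEventState = getRuntimeContext().getState(
                    new ValueStateDescriptor<Tuple3<String, String, Long>>("app-state", Types.TUPLE(Types.STRING, Types.STRING, Types.LONG))
            );
            thirdPartyEventState = getRuntimeContext().getState(
                    new ValueStateDescriptor<Tuple4<String, String, String, Long>>("third-party-state", Types.TUPLE(Types.STRING, Types.STRING, Types.STRING, Types.LONG))
            );
        }

        @Override
        public void processElement1(Tuple3<String, String, Long> value, Context ctx, Collector<String> out) throws Exception {
            // An app event arrived: check whether the third-party event has already been seen
            if (thirdPartyEventState.value() != null) {
                out.collect("Reconciliation succeeded: " + value + " " + thirdPartyEventState.value());
                // Clear the matched state so the pending timer does not report a failure
                thirdPartyEventState.clear();
            } else {
                // Store the app event and wait for the other side
                appEventState.update(value);
                ctx.timerService().registerEventTimeTimer(value.f2 + 5000L); // wait 5 s
            }
        }

        @Override
        public void processElement2(Tuple4<String, String, String, Long> value, Context ctx, Collector<String> out) throws Exception {
            // A third-party event arrived: check whether the app event has already been seen
            if (appEventState.value() != null) {
                out.collect("Reconciliation succeeded: " + appEventState.value() + " " + value);
                // Clear the matched state so the pending timer does not report a failure
                appEventState.clear();
            } else {
                // Store the third-party event and wait for the other side
                thirdPartyEventState.update(value);
                ctx.timerService().registerEventTimeTimer(value.f3 + 5000L); // wait 5 s
            }
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
            // Any state still present when the timer fires means the other side never arrived
            if (appEventState.value() != null) {
                out.collect("Reconciliation failed: " + appEventState.value() + " has no matching third-party event");
            }
            if (thirdPartyEventState.value() != null) {
                out.collect("Reconciliation failed: " + thirdPartyEventState.value() + " has no matching app event");
            }
            appEventState.clear();
            thirdPartyEventState.clear();
        }
    }
}

Running the program prints:

Reconciliation succeeded: (order-1,app,1000) (order-1,third-party,success,3000)
Reconciliation succeeded: (order-3,app,3500) (order-3,third-party,success,4000)
Reconciliation failed: (order-2,app,2000) has no matching third-party event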
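
The match-or-timeout logic of the demo above can be traced without a Flink cluster. The following plain-Java sketch is only a simulation under stated assumptions: ReconcileSketch and its methods are hypothetical names, per-key state is modeled with maps, event-time timers with a sorted map, and the end of the bounded input is modeled as a watermark of Long.MAX_VALUE that fires every remaining timer. All app events are fed before the third-party events, which is one possible arrival order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical simulation of the reconciliation logic; it does not use any Flink API. */
class ReconcileSketch {
    static final long TIMEOUT_MS = 5000L;

    // order id -> event time of the event still waiting for its counterpart
    private final Map<String, Long> pendingApp = new HashMap<>();
    private final Map<String, Long> pendingThirdParty = new HashMap<>();
    // timer fire time -> order ids registered for that time
    private final TreeMap<Long, List<String>> timers = new TreeMap<>();
    final List<String> out = new ArrayList<>();

    void onApp(String orderId, long ts) {
        if (pendingThirdParty.remove(orderId) != null) {
            out.add("matched " + orderId);
        } else {
            pendingApp.put(orderId, ts);
            registerTimer(orderId, ts + TIMEOUT_MS);
        }
    }

    void onThirdParty(String orderId, long ts) {
        if (pendingApp.remove(orderId) != null) {
            out.add("matched " + orderId);
        } else {
            pendingThirdParty.put(orderId, ts);
            registerTimer(orderId, ts + TIMEOUT_MS);
        }
    }

    private void registerTimer(String orderId, long fireTime) {
        timers.computeIfAbsent(fireTime, k -> new ArrayList<>()).add(orderId);
    }

    /** Fires every timer whose time is at or before the watermark. */
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            for (String orderId : timers.pollFirstEntry().getValue()) {
                // A timer for an already matched order finds no state and emits nothing
                if (pendingApp.remove(orderId) != null) {
                    out.add("unmatched " + orderId + ": no third-party event");
                }
                if (pendingThirdParty.remove(orderId) != null) {
                    out.add("unmatched " + orderId + ": no app event");
                }
            }
        }
    }

    static List<String> demo() {
        ReconcileSketch s = new ReconcileSketch();
        s.onApp("order-1", 1000L);
        s.onApp("order-2", 2000L);
        s.onApp("order-3", 3500L);
        s.onThirdParty("order-1", 3000L);
        s.onThirdParty("order-3", 4000L);
        s.advanceWatermark(Long.MAX_VALUE); // end of bounded input
        return s.out;
    }

    public static void main(String[] args) {
        // prints [matched order-1, matched order-3, unmatched order-2: no third-party event]
        System.out.println(demo());
    }
}
```

The timers registered for order-1 and order-3 still fire at the end, but because the matched state was removed they produce no output. This is why clearing state on a successful match matters: without it, every matched order would also be reported as a failure.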
Posted by 晓枫的春天