
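In a Flink program, aggregations, windowed computations and similar operations generally start with the keyBy operator, which partitions the stream by key and yields a KeyedStream. You specify a key, and records are divided into groups according to the key's hash code and distributed to different parallel subtasks. This is a logical split of the stream, and it lets the job exploit parallelism to process large volumes of data in real time. In addition, setting timers through TimerService is supported only on a KeyedStream. So in most cases we apply keyBy first and define the processing logic afterwards.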
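The idea of routing records by the hash of their key can be sketched in plain Java. This is only an illustration under a simplifying assumption: the class `KeyPartitionSketch` and its methods are hypothetical, and the hash-modulo rule stands in for Flink's real assignment, which goes through key groups and is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustration only (not Flink code): route keys to subtasks by hash code. */
class KeyPartitionSketch {

    /** Simplified routing rule: hash code modulo parallelism. */
    static int subtaskFor(Object key, int parallelism) {
        // floorMod keeps the index non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** Groups the given keys by the subtask index they are routed to. */
    static Map<Integer, List<String>> partition(List<String> keys, int parallelism) {
        Map<Integer, List<String>> result = new TreeMap<>();
        for (String key : keys) {
            result.computeIfAbsent(subtaskFor(key, parallelism), i -> new ArrayList<>()).add(key);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> users = Arrays.asList("alice", "bob", "alice", "carol", "bob");
        // Every occurrence of the same key ends up in the same group
        System.out.println(partition(users, 4));
    }
}
```

Whatever the exact hash function, the property that matters for timers and state is the one shown here: the same key always maps to the same subtask.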



public abstract TimerService timerService();


/** Interface for working with time and timers. */
public interface TimerService {

    /** Error string for {@link UnsupportedOperationException} on registering timers. */
    String UNSUPPORTED_REGISTER_TIMER_MSG = "Setting timers is only supported on a keyed streams.";

    /** Error string for {@link UnsupportedOperationException} on deleting timers. */
    String UNSUPPORTED_DELETE_TIMER_MSG = "Deleting timers is only supported on a keyed streams.";

    /** Returns the current processing time. */
    long currentProcessingTime();

    /** Returns the current event-time watermark. */
    long currentWatermark();

    /**
     * Registers a timer to be fired when processing time passes the given time.
     * <p>Timers can internally be scoped to keys and/or windows. When you set a timer in a keyed
     * context, such as in an operation on {@link
     * org.apache.flink.streaming.api.datastream.KeyedStream} then that context will also be active
     * when you receive the timer notification.
     */
    void registerProcessingTimeTimer(long time);

    /**
     * Registers a timer to be fired when the event time watermark passes the given time.
     * <p>Timers can internally be scoped to keys and/or windows. When you set a timer in a keyed
     * context, such as in an operation on {@link
     * org.apache.flink.streaming.api.datastream.KeyedStream} then that context will also be active
     * when you receive the timer notification.
     */
    void registerEventTimeTimer(long time);

    /**
     * Deletes the processing-time timer with the given trigger time. This method has only an effect
     * if such a timer was previously registered and did not already expire.
     * <p>Timers can internally be scoped to keys and/or windows. When you delete a timer, it is
     * removed from the current keyed context.
     */
    void deleteProcessingTimeTimer(long time);

    /**
     * Deletes the event-time timer with the given trigger time. This method has only an effect if
     * such a timer was previously registered and did not already expire.
     * <p>Timers can internally be scoped to keys and/or windows. When you delete a timer, it is
     * removed from the current keyed context.
     */
    void deleteEventTimeTimer(long time);
}
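The key scoping described in these Javadoc comments can be pictured with a toy model. The class `KeyedTimerSketch` below is hypothetical and is not how Flink stores timers; unlike the real `TimerService` methods, which return `void` and take the key from the current context, its methods take the key explicitly and return a boolean so the effect is visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Toy model (not Flink code): one sorted set of timer timestamps per key. */
class KeyedTimerSketch {
    private final Map<String, TreeSet<Long>> timersByKey = new HashMap<>();

    /** Registers a timer for the key; registering the same timestamp twice is a no-op. */
    boolean register(String key, long time) {
        return timersByKey.computeIfAbsent(key, k -> new TreeSet<>()).add(time);
    }

    /** Deletes a timer for the key; returns false if no such timer is pending. */
    boolean delete(String key, long time) {
        TreeSet<Long> timers = timersByKey.get(key);
        return timers != null && timers.remove(time);
    }

    /** Number of timers currently pending for the key. */
    int pending(String key) {
        TreeSet<Long> timers = timersByKey.get(key);
        return timers == null ? 0 : timers.size();
    }

    public static void main(String[] args) {
        KeyedTimerSketch timers = new KeyedTimerSketch();
        timers.register("alice", 10_000L);
        timers.register("alice", 10_000L); // duplicate for the same key, ignored
        timers.register("bob", 10_000L);   // same timestamp, different key
        System.out.println(timers.pending("alice"));         // prints 1
        System.out.println(timers.delete("carol", 10_000L)); // prints false
        System.out.println(timers.delete("bob", 10_000L));   // prints true
    }
}
```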


// round the timestamp down to whole seconds so that timers within the same second coalesce
long coalescedTime = time / 1000 * 1000;
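Why rounding helps: Flink keeps at most one timer per key and timestamp, so timers that are rounded to the same second collapse into a single timer. The sketch below shows only the arithmetic and the de-duplication effect for one key; the class name `TimerCoalescingSketch` and the sample timestamps are made up for illustration.

```java
import java.util.TreeSet;

/** Illustration only: how rounding to full seconds reduces the number of distinct timers. */
class TimerCoalescingSketch {

    /** Rounds a non-negative timestamp in milliseconds down to the start of its second. */
    static long coalesce(long timeMillis) {
        return timeMillis / 1000 * 1000;
    }

    public static void main(String[] args) {
        long[] requested = {1657081915509L, 1657081915720L, 1657081915999L, 1657081916001L};

        TreeSet<Long> exact = new TreeSet<>();
        TreeSet<Long> coalesced = new TreeSet<>();
        for (long t : requested) {
            exact.add(t);               // one timer per distinct millisecond
            coalesced.add(coalesce(t)); // one timer per distinct second
        }

        System.out.println(exact.size()); // prints 4
        System.out.println(coalesced);    // prints [1657081915000, 1657081916000]
    }
}
```

The trade-off is precision: a coalesced timer can fire up to one second earlier than the time originally requested.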




stream.keyBy(t -> t.f0).process(new MyKeyedProcessFunction());


    /**
     * Process one element from the input stream.
     * <p>This function can output zero or more elements using the {@link Collector} parameter and
     * also update internal state or set timers using the {@link Context} parameter.
     * @param value The input value.
     * @param ctx A {@link Context} that allows querying the timestamp of the element and getting a
     *     {@link TimerService} for registering timers and querying the time. The context is only
     *     valid during the invocation of this method, do not store it.
     * @param out The collector for returning result values.
     * @throws Exception This method may throw exceptions. Throwing an exception will cause the
     *     operation to fail and may trigger recovery.
     */
    public abstract void processElement(I value, Context ctx, Collector<O> out) throws Exception;

    /**
     * Called when a timer set using {@link TimerService} fires.
     * @param timestamp The timestamp of the firing timer.
     * @param ctx An {@link OnTimerContext} that allows querying the timestamp, the {@link
     *     TimeDomain}, and the key of the firing timer and getting a {@link TimerService} for
     *     registering timers and querying the time. The context is only valid during the invocation
     *     of this method, do not store it.
     * @param out The collector for returning result values.
     * @throws Exception This method may throw exceptions. Throwing an exception will cause the
     *     operation to fail and may trigger recovery.
     */
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception {}


    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<Event> eventDS = env.addSource(new ClickSource());

        eventDS.keyBy(data -> data.user)
                // KeyedProcessFunction<K, I, O>: key type, input type, output type
                .process(new KeyedProcessFunction<String, Event, String>() {
                    @Override
                    public void processElement(Event value, Context ctx, Collector<String> out) throws Exception {
                        long currTs = ctx.timerService().currentProcessingTime(); // current processing time
                        out.collect(ctx.getCurrentKey() + " 数据到达时间 -> " + new Timestamp(currTs));
                        // register a processing-time timer 10 s from now
                        ctx.timerService().registerProcessingTimeTimer(currTs + 10 * 1000L);
                    }

                    @Override
                    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
                        out.collect(ctx.getCurrentKey() + " 定时器触发时间 -> " + new Timestamp(timestamp));
                    }
                })
                .print(); // write the collected strings to the console

        env.execute();
    }

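In the code above, timers can only be used on a KeyedStream, so keyBy comes first. Here .keyBy(data -> data.user) uses the user name as the key, so all events of the same user are routed to the same partition. The custom KeyedProcessFunction then does two things: .processElement() is called once for every incoming element and registers a timer 10 seconds in the future, and .onTimer() is called when a timer fires. When the program runs, the console first prints the "数据到达时间" (arrival time) line for an element, and 10 seconds later the "定时器触发时间" (timer fired) line for the same key; the two printed times are exactly 10 seconds apart. This example uses processing-time timers, so you really do have to wait 10 seconds to see the result. What changes under event-time semantics? A small modification of the code above lets us test it: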

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        SingleOutputStreamOperator<Event> eventDS = env.addSource(new ClickSource())
                // the strategy call was missing from this listing; forMonotonousTimestamps()
                // is assumed here because the text says no watermark delay is configured
                .assignTimestampsAndWatermarks(WatermarkStrategy.<Event>forMonotonousTimestamps()
                        .withTimestampAssigner(new SerializableTimestampAssigner<Event>() {
                            @Override
                            public long extractTimestamp(Event element, long recordTimestamp) {
                                return element.timestamp;
                            }
                        }));

        eventDS.keyBy(data -> data.user)
                .process(new KeyedProcessFunction<String, Event, String>() {
                    @Override
                    public void processElement(Event value, Context ctx, Collector<String> out) throws Exception {
                        // note: currTs is the current watermark, not the element's own timestamp
                        long currTs = ctx.timerService().currentWatermark();
                        out.collect(ctx.getCurrentKey() + " 数据到时间戳 -> " + new Timestamp(currTs) + " watermaker " + ctx.timerService().currentWatermark());
                        // register an event-time timer 10 s after currTs
                        ctx.timerService().registerEventTimeTimer(currTs + 10 * 1000L);
                    }

                    @Override
                    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
                        out.collect(ctx.getCurrentKey() + " 定时器触发时间 -> " + new Timestamp(timestamp));
                    }
                })
                .print(); // write the collected strings to the console

        env.execute();
    }

令狐冲 数据到时间戳 -> 292278994-08-17 15:12:55.192 watermaker -9223372036854775808
令狐冲 定时器触发时间 -> 292269055-12-03 00:47:14.192
任盈盈 数据到时间戳 -> 2022-07-06 12:31:55.509 watermaker 1657081915509
莫大 数据到时间戳 -> 2022-07-06 12:31:56.521 watermaker 1657081916521
依琳 数据到时间戳 -> 2022-07-06 12:31:57.529 watermaker 1657081917529
任盈盈 数据到时间戳 -> 2022-07-06 12:31:58.53 watermaker 1657081918530
令狐冲 数据到时间戳 -> 2022-07-06 12:31:59.545 watermaker 1657081919545
风清扬 数据到时间戳 -> 2022-07-06 12:32:00.55 watermaker 1657081920550
风清扬 数据到时间戳 -> 2022-07-06 12:32:01.551 watermaker 1657081921551
依琳 数据到时间戳 -> 2022-07-06 12:32:02.554 watermaker 1657081922554
风清扬 数据到时间戳 -> 2022-07-06 12:32:03.567 watermaker 1657081923567
任盈盈 数据到时间戳 -> 2022-07-06 12:32:04.577 watermaker 1657081924577
任盈盈 定时器触发时间 -> 2022-07-06 12:32:05.509
任盈盈 数据到时间戳 -> 2022-07-06 12:32:05.587 watermaker 1657081925587
莫大 定时器触发时间 -> 2022-07-06 12:32:06.521
风清扬 数据到时间戳 -> 2022-07-06 12:32:06.599 watermaker 1657081926599
依琳 定时器触发时间 -> 2022-07-06 12:32:07.529
莫大 数据到时间戳 -> 2022-07-06 12:32:07.61 watermaker 1657081927610
任盈盈 定时器触发时间 -> 2022-07-06 12:32:08.53
莫大 数据到时间戳 -> 2022-07-06 12:32:08.623 watermaker 1657081928623
令狐冲 定时器触发时间 -> 2022-07-06 12:32:09.545
依琳 数据到时间戳 -> 2022-07-06 12:32:09.633 watermaker 1657081929633
风清扬 定时器触发时间 -> 2022-07-06 12:32:10.55
令狐冲 数据到时间戳 -> 2022-07-06 12:32:10.643 watermaker 1657081930643
风清扬 定时器触发时间 -> 2022-07-06 12:32:11.551
风清扬 数据到时间戳 -> 2022-07-06 12:32:11.653 watermaker 1657081931653
依琳 定时器触发时间 -> 2022-07-06 12:32:12.554

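In a bounded run with just three elements, every incoming element prints two "data arrived" lines, set off by a divider, and consecutive elements arrive 5 seconds apart. Right after the third element arrives, one timer-fired message is printed; 5 seconds later the remaining two timer messages appear and the program ends.

Notice that when an element arrives, the current watermark does not match its timestamp. When the first element arrives with timestamp 1000, the watermark has not changed yet, because watermarks are generated periodically (every 200 ms by default); it is still the minimum value, Long.MIN_VALUE. The first line of the output above shows the same effect. Once the next generation point is reached (after 200 ms), a watermark is produced from the largest timestamp seen so far, 1000. No watermark delay is configured here, and 1 millisecond is subtracted by default, so the watermark advances to 999. When the second element with timestamp 11000 arrives, the watermark again does not change immediately and is still 999, as if it always lagged behind the data.

With that, the program's behavior is easy to explain. Under event-time semantics, a timer fires when the watermark reaches the timer's time. After the first element, the timer is set to 1000 + 10 * 1000 = 11000. When the second element with timestamp 11000 arrives, the watermark is still at 999, so the timer does not fire; afterwards the watermark advances to 10999, which is still not enough. Only the third element pushes the watermark to 11000 and fires the first timer. Five seconds after the third element, no more data is produced and the program is about to exit; at that point Flink automatically advances the watermark to the maximum long value (Long.MAX_VALUE). All timers that have not fired yet then fire together, which is why the last two timer messages appear on the console.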
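The walkthrough above can be replayed with a small single-key simulation in plain Java. This is a toy model, not Flink code: the class `EventTimeTimerSketch` is hypothetical, the periodic watermark emission is approximated by emitting once after each element, and the third timestamp, 11001, is an assumption chosen so that the watermark reaches 11000 (the text only gives 1000 and 11000).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy single-key simulation (not Flink code) of event-time timers driven by a watermark. */
class EventTimeTimerSketch {
    private long watermark = Long.MIN_VALUE;
    private long maxTimestamp = Long.MIN_VALUE;
    private final TreeSet<Long> timers = new TreeSet<>();
    final List<String> log = new ArrayList<>();

    /** Processes an element: it still sees the old watermark and registers a timer 10 s later. */
    void onElement(long timestamp) {
        log.add("element " + timestamp + " sees watermark " + watermark);
        timers.add(timestamp + 10 * 1000L);
        maxTimestamp = Math.max(maxTimestamp, timestamp);
    }

    /** Stands in for the periodic emission: largest timestamp seen so far minus 1 ms. */
    void emitWatermark() {
        if (maxTimestamp == Long.MIN_VALUE) {
            return; // nothing seen yet, keep the initial watermark
        }
        advanceTo(maxTimestamp - 1);
    }

    /** Advances the watermark and fires every timer whose time it has reached. */
    void advanceTo(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
        while (!timers.isEmpty() && timers.first() <= watermark) {
            log.add("timer " + timers.pollFirst() + " fires at watermark " + watermark);
        }
    }

    /** Replays three elements, then the end-of-input watermark, and returns the log. */
    static List<String> run() {
        EventTimeTimerSketch sim = new EventTimeTimerSketch();
        for (long ts : new long[] {1000L, 11000L, 11001L}) {
            sim.onElement(ts);
            sim.emitWatermark(); // the watermark only moves after the element was processed
        }
        sim.advanceTo(Long.MAX_VALUE); // end of input: all remaining timers fire
        return sim.log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model the timer for 11000 fires only after the third element, and the timers for 21000 and 21001 fire when the watermark jumps to Long.MAX_VALUE, matching the order described above.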

