Summary: CEP, complex event processing. Wiki definition: “Complex event processing, or CEP, is event processing that combines data from multiple sources[2] to infer events
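Summary: A dimension table, as a data warehouse concept, is a collection of dimension attributes, such as a time dimension or a location dimension. The topic here, though, is dimension tables in stream processing, where the problem differs from the data warehouse case: the data collected by an agent is usually limited, so the missing dimension attributes have to be filled in in real time before the data can serve any business purpose. The problem is in fact a join between the main data stream and several static or semi-static tables. In Flink this is called the side input problem, https://cwiki.a...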
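The join described above can be sketched in plain Java, without any Flink API. This is a minimal illustration under stated assumptions: the class name, the `enrich` method, the two-field event layout, and the `"unknown"` fallback are all hypothetical, and the dimension table is assumed to be small enough to hold in memory as a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is plain Java, not the Flink side input API.
final class DimensionJoinSketch {

    /** Looks up each event's key in the dimension table and appends the attribute found there. */
    static List<String> enrich(List<String[]> events, Map<String, String> dimTable) {
        List<String> out = new ArrayList<>();
        for (String[] event : events) {
            String deviceId = event[0];
            String metric = event[1];
            // Assumption: a key missing from the table is tagged "unknown" rather than dropped.
            String city = dimTable.getOrDefault(deviceId, "unknown");
            out.add(deviceId + "," + metric + "," + city);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> dim = new HashMap<>();
        dim.put("d1", "Hangzhou");
        List<String[]> events = new ArrayList<>();
        events.add(new String[] {"d1", "42"});
        events.add(new String[] {"d2", "7"});
        System.out.println(enrich(events, dim));
    }
}
```

A semi-static table would additionally need a refresh path (for example, swapping in a new map periodically), which this sketch leaves out.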
Summary: WindowOperator processElement: if clear merely registers an EventTimeTimer, then onEventTime must contain the actual clear logic. WindowOperator.onEventTime: sure enough, onEventTime checks the Timer's time ...
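The idea of deferring cleanup to a timer can be sketched as follows. This is a simplified plain-Java model, not the WindowOperator code: the class and method signatures are hypothetical, and it assumes that the timer check compares the firing time with a per-window cleanup time, taken here to be the window end plus an `allowedLateness` value. The excerpt is cut off before it states the actual condition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration; none of these names are Flink classes.
final class WindowCleanupSketch {
    // Window end timestamp -> buffered values (stands in for per-window state).
    private final Map<Long, List<Integer>> windowState = new HashMap<>();
    // Timer timestamp -> end of the window the timer was registered for.
    private final Map<Long, Long> cleanupTimers = new HashMap<>();
    // Assumption: cleanup time = window end + allowed lateness.
    private final long allowedLateness;

    WindowCleanupSketch(long allowedLateness) {
        this.allowedLateness = allowedLateness;
    }

    /** Adds a value to its window and only registers a timer for the later cleanup. */
    void processElement(long windowEnd, int value) {
        windowState.computeIfAbsent(windowEnd, k -> new ArrayList<>()).add(value);
        cleanupTimers.put(windowEnd + allowedLateness, windowEnd);
    }

    /** Clears window state only when a timer registered for exactly this time exists. */
    void onEventTime(long timerTime) {
        Long windowEnd = cleanupTimers.remove(timerTime);
        if (windowEnd != null) {
            windowState.remove(windowEnd);
        }
    }

    boolean hasState(long windowEnd) {
        return windowState.containsKey(windowEnd);
    }

    public static void main(String[] args) {
        WindowCleanupSketch op = new WindowCleanupSketch(5);
        op.processElement(100, 1);
        op.onEventTime(100); // not the cleanup time, state is kept
        System.out.println(op.hasState(100));
        op.onEventTime(105); // cleanup time reached, state is dropped
        System.out.println(op.hasState(100));
    }
}
```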
Summary: WindowOperator.processElement: its main job is to add the value of the current element to the corresponding window, windowState.setCurrentNamespace(window); windowState.add(element.getValue()); triggerContex...
Summary: Initialization. Task: List consumedPartitions = tdd.getInputGates(); // Consumed intermediate result partitions this.inputGates = new SingleInputGate[consumedPartitions.size()]; this.inputGatesById = new Has...
Summary: Data is normally sent through collector.collect: public interface Collector { /** * Emits a record. * * @param record The record to collect. */ void collect(T record); /** ...
Summary: /* {@code * DataStream stream = ...; * KeyedStream keyedStream = stream.keyBy("id"); * * keyedStream.map(new RichMapFunction>() { * * private ValueState count;...
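Summary: How does Flink CEP convert a pattern into an NFA? And when an event arrives, how is it executed in the NFA? The call chain leading up to this point: CEP –> PatternStream –> select –> CEPOperatorUtils.createPatternStream 1. An NFACompiler.compileFactory is produced, which completes the conversion from pattern to state: final NFACompiler...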
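To make the pattern-to-state idea concrete, here is a deliberately small plain-Java sketch. It is not Flink's NFACompiler or NFA: the class `TinyNfa` and its `process` method are hypothetical, each pattern step is reduced to a single predicate, and events that match no step are simply ignored (an assumed relaxed-contiguity rule). It only tracks which states are reachable, not the events that form each match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified automaton; not the Flink CEP implementation.
final class TinyNfa<T> {
    private final List<Predicate<T>> conditions;
    // active[i] is true when some partial match has satisfied the first i conditions.
    private final boolean[] active;

    TinyNfa(List<Predicate<T>> conditions) {
        this.conditions = conditions;
        this.active = new boolean[conditions.size() + 1];
        this.active[0] = true; // the start state is always active
    }

    /** Feeds one event; returns true if this event completes a match. */
    boolean process(T event) {
        boolean matched = false;
        // Walk backwards so a single event advances any partial match by at most one state.
        for (int i = conditions.size() - 1; i >= 0; i--) {
            if (active[i] && conditions.get(i).test(event)) {
                if (i + 1 == conditions.size()) {
                    matched = true;
                } else {
                    active[i + 1] = true;
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<Predicate<Integer>> steps = new ArrayList<>();
        steps.add(x -> x == 1); // first step of the pattern
        steps.add(x -> x == 2); // second step of the pattern
        TinyNfa<Integer> nfa = new TinyNfa<>(steps);
        for (int event : new int[] {2, 1, 3, 2}) {
            System.out.println(event + " -> " + nfa.process(event));
        }
    }
}
```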
Summary: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html The goal, first of all, is to match a pattern sequence. A pattern sequence is made up of multiple patterns: DataStream input = ... Pattern pattern = Pattern.begin("start").w...
Summary: Usage: dataStream.coGroup(otherStream) .where(0).equalTo(1) .window(TumblingEventTimeWindows.of(Time.seconds(3))) .apply (new CoGroupFunction () {...}); As you can see, coGroup merely produces a CoGroupedStr...
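What a co-group computes for the contents of one window can be sketched in plain Java. This is an illustration only, not the Flink operator: the class `CoGroupSketch` and its string output format are hypothetical, windowing is assumed to have happened already, and the key positions mirror `where(0).equalTo(1)` (field 0 on the left input, field 1 on the right input).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; both inputs are assumed to hold the elements of a single window.
final class CoGroupSketch {

    /** Groups both inputs by key and pairs the two groups for every key seen on either side. */
    static Map<String, String> coGroup(List<String[]> left, List<String[]> right) {
        Map<String, List<String>> leftGroups = new TreeMap<>();
        Map<String, List<String>> rightGroups = new TreeMap<>();
        for (String[] t : left) {
            // Left tuples are (key, payload): the key is field 0.
            leftGroups.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t[1]);
        }
        for (String[] t : right) {
            // Right tuples are (payload, key): the key is field 1.
            rightGroups.computeIfAbsent(t[1], k -> new ArrayList<>()).add(t[0]);
        }
        Set<String> keys = new TreeSet<>(leftGroups.keySet());
        keys.addAll(rightGroups.keySet());
        Map<String, String> out = new TreeMap<>();
        for (String key : keys) {
            // A key present on only one side is still emitted, with an empty group for the other.
            List<String> l = leftGroups.getOrDefault(key, Collections.emptyList());
            List<String> r = rightGroups.getOrDefault(key, Collections.emptyList());
            out.put(key, l + "|" + r);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> left = new ArrayList<>();
        left.add(new String[] {"a", "x"});
        List<String[]> right = new ArrayList<>();
        right.add(new String[] {"y", "a"});
        right.add(new String[] {"z", "b"});
        System.out.println(coGroup(left, right));
    }
}
```

Emitting keys that appear on only one side is what separates a co-group from an inner join, which would drop key `b` in the example in `main`.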
Summary: Task.run if (invokable instanceof StatefulTask) { StatefulTask op = (StatefulTask) invokable; op.setInitialState(taskStateHandles); } // run the invokable invokable.invoke(); Here invokable is a StreamT...
Summary: https://docs.google.com/document/d/1Lr9UYXEz6s6R_3PWg3bZQLF3upGaNEkc0rQCFSzaYDI/edit // create the original stream DataStream stream = ...; // apply the async I/O transformation DataStream> re...
Summary: Properties properties = new Properties(); properties.setProperty("bootstrap.servers", "localhost:9092"); // only required for Kafka 0.8 properties.setProperty("zookeeper.connect", "localhost:2181"); p...
Summary: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/kafka.html Usage: DataStream stream = ...; FlinkKafkaProducer010Configuration myProducerConfig = FlinkKafkaProducer010....
Summary: https://calcite.apache.org/docs/stream.html Calcite’s SQL is an extension to standard SQL, not another ‘SQL-like’ language. The distinction is important, for several reasons: Streaming SQL is ...
Summary: For a DataStream, the following strategies can be chosen: /** * Sets the partitioning of the {@link DataStream} so that the output elements * are broadcasted to every parallel instance of the next operation. ...
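The broadcast strategy quoted above comes down to a choice of output channels per record, which a short plain-Java sketch can show. The class `PartitionSketch` and its methods are hypothetical and are not Flink partitioner classes; the round-robin method is included only as a contrast and is an assumption of this sketch, since the excerpt is cut off before listing other strategies.

```java
// Hypothetical channel selectors; each returns the downstream channel indices for one record.
final class PartitionSketch {
    private int next = -1;

    /** Broadcast: every record goes to every parallel instance of the next operation. */
    static int[] broadcast(int numChannels) {
        int[] all = new int[numChannels];
        for (int i = 0; i < numChannels; i++) {
            all[i] = i;
        }
        return all;
    }

    /** Round-robin (shown for contrast): each record goes to exactly one channel, in turn. */
    int[] roundRobin(int numChannels) {
        next = (next + 1) % numChannels;
        return new int[] {next};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(broadcast(3)));
        PartitionSketch p = new PartitionSketch();
        for (int i = 0; i < 3; i++) {
            System.out.println(java.util.Arrays.toString(p.roundRobin(2)));
        }
    }
}
```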
Summary: How resources are allocated for a Job: submitJob generates an ExecutionGraph, and the call eventually reaches executionGraph.scheduleForExecution(scheduler). Next, in ExecutionGraph: public void scheduleForExecution(SlotProvider slotProvider) throws JobException...
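Summary: A SlotSharingGroup indicates that different tasks may share a slot, but this is a soft constraint, so the tasks may also end up in different slots. By default, the whole StreamGraph uses a single "default" SlotSharingGroup, which means the tasks of all JobVertex instances can share one slot /** * A slot sharing units defines which dif...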
Summary: The JobManager, as an actor: case SubmitJob(jobGraph, listeningBehaviour) => val client = sender() val jobInfo = new JobInfo(client, listeningBehaviour, System.currentTimeMillis(), jo...
Summary: Start with the simplest example: final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); DataStream> stream = env.addSource(...); stream .map(new MapFunction() {...}) .add...