Post category - Apache Flink


Flink – WindowedStream
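Abstract: A WindowedStream supports operations such as reduce, aggregate, min, and max. The key is to understand how the WindowOperator uses KVState, because the window stores its window buffer there. Different kinds of KVState behave differently, e.g. ReducingState versus ListState. Reduce /** * Applies the gi...

posted @ 2017-03-21 17:27 fxjwind Views (2575) Comments (0) Recommendations (0)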
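
To make the point about window buffers in the entry above concrete, here is a minimal sketch in plain Java. It does not use Flink's classes: `WindowBufferSketch`, `ReducingBuffer`, `ListBuffer`, and `windowStart` are hypothetical names, and the per-window maps only stand in for keyed state. It contrasts a reduce-style buffer, which folds every element into one value per window, with a list-style buffer, which keeps every element until the window is evaluated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical sketch, not Flink's WindowOperator: it only contrasts two buffer styles.
public class WindowBufferSketch {

    /** Reduce-style buffer: folds each element into a single value per window. */
    static final class ReducingBuffer<T> {
        private final Map<Long, T> perWindow = new HashMap<>();
        private final BinaryOperator<T> reducer;

        ReducingBuffer(BinaryOperator<T> reducer) {
            this.reducer = reducer;
        }

        void add(long windowStart, T value) {
            // Stores the first element as-is, then combines later ones with the reducer.
            perWindow.merge(windowStart, value, reducer);
        }

        T get(long windowStart) {
            return perWindow.get(windowStart);
        }
    }

    /** List-style buffer: keeps every element of the window. */
    static final class ListBuffer<T> {
        private final Map<Long, List<T>> perWindow = new HashMap<>();

        void add(long windowStart, T value) {
            perWindow.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(value);
        }

        List<T> get(long windowStart) {
            return perWindow.getOrDefault(windowStart, new ArrayList<>());
        }
    }

    /** Start of the tumbling window containing a non-negative timestamp. */
    static long windowStart(long timestamp, long size) {
        return timestamp - (timestamp % size);
    }

    public static void main(String[] args) {
        ReducingBuffer<Integer> sums = new ReducingBuffer<>(Integer::sum);
        ListBuffer<Integer> all = new ListBuffer<>();
        long[] timestamps = {1L, 4L, 12L};
        int[] values = {3, 4, 5};
        for (int i = 0; i < values.length; i++) {
            long w = windowStart(timestamps[i], 10L);
            sums.add(w, values[i]);
            all.add(w, values[i]);
        }
        System.out.println(sums.get(0L)); // one value for window [0, 10)
        System.out.println(all.get(0L));  // all elements of window [0, 10)
    }
}
```

The reduce-style buffer uses constant space per window but can only serve functions that combine elements incrementally; the list-style buffer costs memory proportional to the window contents but lets the window function see every element.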

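Flink - Watermark Generation

Abstract: Reference: Flink - Generating Timestamps / Watermarks. Watermarks only come into play when a window is involved, so it is enough to call assignTimestampsAndWatermarks right before the window operator; they do not have to be emitted from the source. 1. First, so

posted @ 2017-03-16 18:07 fxjwind Views (4667) Comments (0) Recommendations (0)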
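
As an illustration of what a watermark assigner placed in front of a window operator has to compute, the following plain-Java sketch tracks the largest event timestamp seen and reports a watermark that lags it by a fixed bound. It is an assumption-laden stand-in, not Flink's API: `WatermarkSketch`, `onElement`, `currentWatermark`, and `windowCanFire` are hypothetical names, and the firing rule (watermark reaches the window end minus one) is a simplification of an event-time trigger.

```java
// Hypothetical sketch of a bounded-out-of-orderness watermark; not Flink's API.
public class WatermarkSketch {
    private final long maxOutOfOrderness;
    private long maxTimestampSeen;

    public WatermarkSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
        // Start low enough that currentWatermark() cannot underflow.
        this.maxTimestampSeen = Long.MIN_VALUE + maxOutOfOrderness;
    }

    /** Called for every element, with the timestamp extracted from it. */
    public void onElement(long eventTimestamp) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestamp);
    }

    /** Asserts that no element older than this value is expected any more. */
    public long currentWatermark() {
        return maxTimestampSeen - maxOutOfOrderness;
    }

    /** A window covering [start, end) may fire once the watermark reaches end - 1. */
    public boolean windowCanFire(long windowEnd) {
        return currentWatermark() >= windowEnd - 1;
    }

    public static void main(String[] args) {
        WatermarkSketch wm = new WatermarkSketch(5L);
        wm.onElement(12L);
        wm.onElement(9L); // out-of-order element, does not move the watermark back
        System.out.println(wm.currentWatermark());
        System.out.println(wm.windowCanFire(10L));
    }
}
```

Because the watermark is derived purely from the elements passing through, nothing in this logic requires it to run at the source; it only has to run upstream of whatever consumes event time.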

Flink – metrics V1.2
Abstract: WebRuntimeMonitor .GET("/jobs/:jobid/vertices/:vertexid/metrics", handler(new JobVertexMetricsHandler(metricFetcher))) .GET("/jobs/:jobid/metrics", handler(new JobMetricsHandler(metricFetcher))) ...

posted @ 2017-02-15 15:27 fxjwind Views (1917) Comments (1) Recommendations (0)

Apache Common Math Stat
Abstract: http://commons.apache.org/proper/commons-math/userguide/stat.html mark DescriptiveStatistics maintains the input data in memory and has the capability of producing "rolling" statistics computed f...

posted @ 2017-02-14 12:27 fxjwind Views (676) Comments (0) Recommendations (0)

Flink – submitJob
Abstract: The submitJob logic in the JobManager: /** * Submits a job to the job manager. The job is registered at the libraryCacheManager which * creates the job's class loader. The job graph is appended to the corr...

posted @ 2017-02-10 14:22 fxjwind Views (3398) Comments (0) Recommendations (0)

Flink - TypeInformation
Abstract: Flink builds its own independent type system; see https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/types_serialization.html Why build its own, instead of letting the programming language or a serialization framework take care of it as other platforms do? Flink tries to know a...

posted @ 2017-01-17 19:09 fxjwind Views (1851) Comments (0) Recommendations (0)

Flink 1.1 – ResourceManager
Abstract: The role of the Flink resource manager is shown in the figure. FlinkResourceManager /** * * Worker allocation steps * * * The resource manager decides to request more workers. This can happen in order * t...

posted @ 2016-12-30 15:32 fxjwind Views (1908) Comments (9) Recommendations (0)

Flink - InstanceManager
Abstract: InstanceManager manages the TaskManager and slot resources that the JobManager has acquired. /** * Simple manager that keeps track of which TaskManager are available and alive. */ public class InstanceManager { // ---------------...

posted @ 2016-12-30 15:32 fxjwind Views (550) Comments (0) Recommendations (0)

Flink - NetworkEnvironment
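Abstract: There is one NetworkEnvironment per TaskManager, not one per task. Its most important part is the networkBufferPool: both the intermediate results an operator produces (ResultPartition) and its input data (InputGate) need memory for temporary buffering, and the networkBufferPool manages that memory. /** * Networ...

posted @ 2016-12-09 16:51 fxjwind Views (1615) Comments (0) Recommendations (1)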
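
The idea of one shared buffer pool per TaskManager can be sketched as a fixed set of equally sized memory segments that producers and consumers borrow and return. The class below is a hypothetical illustration in plain Java, not Flink's NetworkBufferPool: `BufferPoolSketch`, `request`, `recycle`, and `available` are assumed names, and real segments would not be plain `byte[]` arrays.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a fixed-size segment pool; not Flink's NetworkBufferPool.
public class BufferPoolSketch {
    private final Deque<byte[]> free = new ArrayDeque<>();
    private final int segmentSize;

    public BufferPoolSketch(int numSegments, int segmentSize) {
        this.segmentSize = segmentSize;
        // All memory is allocated up front; nothing is allocated later.
        for (int i = 0; i < numSegments; i++) {
            free.push(new byte[segmentSize]);
        }
    }

    /** Returns a free segment, or null when the pool is exhausted. */
    public synchronized byte[] request() {
        return free.poll();
    }

    /** Hands a segment back so that another partition or gate can reuse it. */
    public synchronized void recycle(byte[] segment) {
        if (segment.length != segmentSize) {
            throw new IllegalArgumentException("segment does not belong to this pool");
        }
        free.push(segment);
    }

    public synchronized int available() {
        return free.size();
    }

    public static void main(String[] args) {
        BufferPoolSketch pool = new BufferPoolSketch(2, 1024);
        byte[] a = pool.request();
        byte[] b = pool.request();
        System.out.println(pool.request() == null); // pool exhausted
        pool.recycle(a);
        System.out.println(pool.available());
    }
}
```

An exhausted pool is what makes a fixed budget useful: a producer that cannot get a segment has to wait, which bounds the memory spent on in-flight data.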

Flink – window operator
Abstract: References: http://wuchong.me/blog/2016/05/25/flink-internals-window-mechanism/ http://wuchong.me/blog/2016/06/06/flink-internals-session-window/ WindowOperator The window operator uses WindowAssigner and Tr...

posted @ 2016-12-06 14:52 fxjwind Views (1846) Comments (0) Recommendations (0)

Flink – Trigger,Evictor
Abstract: org.apache.flink.streaming.api.windowing.triggers; Trigger public abstract class Trigger implements Serializable { /** * Called for every element that gets added to a pane. The resul...

posted @ 2016-12-02 16:29 fxjwind Views (3410) Comments (0) Recommendations (0)

Flink - RocksDBStateBackend
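Abstract: For usability and efficiency, it makes sense to use RocksDB in place of a plain in-memory KV store. With RocksDB you get range queries, column families, and a choice of compression schemes. But RocksDB itself is only a library that runs inside the RocksDBStateBackend, so when the TaskManager dies the data is still lost. The RocksDBStateBackend therefore still needs distributed storage such as HDFS to store snapsho...

posted @ 2016-11-29 16:49 fxjwind Views (6634) Comments (0) Recommendations (0)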
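
The division of labor described in the entry above, fast local state plus snapshots kept somewhere that survives the process, can be sketched without RocksDB or HDFS. Everything here is a stand-in chosen for illustration: `LocalStateSnapshotSketch` is a hypothetical class, a `TreeMap` plays the role of the local sorted store, and a copied map plays the role of the snapshot in durable storage.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: a sorted local store plus snapshots that outlive it.
public class LocalStateSnapshotSketch {
    // Stand-in for the local store; sorted keys make range scans possible.
    private TreeMap<String, Long> local = new TreeMap<>();

    void put(String key, long value) {
        local.put(key, value);
    }

    /** Range query over [fromInclusive, toExclusive). */
    SortedMap<String, Long> range(String fromInclusive, String toExclusive) {
        return local.subMap(fromInclusive, toExclusive);
    }

    /** Copies the local state; the copy stands in for a file in durable storage. */
    TreeMap<String, Long> snapshot() {
        return new TreeMap<>(local);
    }

    /** Simulates losing the process, and with it all local state. */
    void crash() {
        local = new TreeMap<>();
    }

    /** Rebuilds local state from a previously taken snapshot. */
    void restore(TreeMap<String, Long> snapshot) {
        local = new TreeMap<>(snapshot);
    }

    int size() {
        return local.size();
    }

    public static void main(String[] args) {
        LocalStateSnapshotSketch state = new LocalStateSnapshotSketch();
        state.put("a", 1L);
        state.put("b", 2L);
        state.put("c", 3L);
        TreeMap<String, Long> durable = state.snapshot();
        state.put("d", 4L); // written after the snapshot, so it will be lost
        state.crash();
        state.restore(durable);
        System.out.println(state.range("a", "c"));
        System.out.println(state.size());
    }
}
```

Writes made after the last snapshot are gone after the simulated crash, which is why a real job would replay its input from the checkpointed position.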

Flink - State Management

Abstract: The post Flink – Checkpoint describes the overall checkpoint flow, but it does not describe in detail how a snapshot is generated and restored; this post fills that in. StreamOperator /** * Basic interface for stream operators. Implementers would implement one of * {@link ...

posted @ 2016-11-25 23:20 fxjwind Views (2433) Comments (0) Recommendations (0)

Flink - state
Abstract: public class StreamTaskState implements Serializable, Closeable { private static final long serialVersionUID = 1L; private StateHandle operatorState; private StateHandle funct...

posted @ 2016-11-25 23:19 fxjwind Views (1381) Comments (0) Recommendations (0)

Flink -- Failover
Abstract: JobManager failover LeaderLatch private synchronized void setLeadership(boolean newValue){ boolean oldValue = hasLeadership.getAndSet(newValue); if ( oldValue && !newValue ) // was the leader,...

posted @ 2016-11-24 19:46 fxjwind Views (1753) Comments (0) Recommendations (0)

Stream Processing for Everyone with SQL and Apache Flink
Abstract: Where did we come from? With the 0.9.0-milestone1 release, Apache Flink added an API to process relational data with SQL-like expressions called the Table API. The central concept of this API is a Ta...

posted @ 2016-11-21 13:14 fxjwind Views (730) Comments (0) Recommendations (0)

Flink -- Barrier
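Abstract: CheckpointBarrierHandler This interface reacts to checkpoint barriers arriving from the input channels; different implementations decide whether to simply track barriers or to actually block inputs. BarrierBuffer The most important function, in which

posted @ 2016-11-19 00:13 fxjwind Views (2030) Comments (0) Recommendations (0)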
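
The "actually block inputs" variant mentioned in the entry above can be sketched as barrier alignment: once a channel has delivered its barrier, its later records are held back until every other channel has delivered its barrier too. The code is a simplified, hypothetical model in plain Java, not Flink's BarrierBuffer; `BarrierAlignmentSketch`, `onRecord`, and `onBarrier` are assumed names, it handles one checkpoint at a time, and records are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of barrier alignment across input channels; not Flink's BarrierBuffer.
public class BarrierAlignmentSketch {
    private final boolean[] blocked;
    private int barriersSeen;
    private final List<String> buffered = new ArrayList<>();
    private final List<String> emitted = new ArrayList<>();

    public BarrierAlignmentSketch(int numChannels) {
        this.blocked = new boolean[numChannels];
    }

    /** Records from a blocked channel are held back; others pass through. */
    public void onRecord(int channel, String record) {
        if (blocked[channel]) {
            buffered.add(record);
        } else {
            emitted.add(record);
        }
    }

    /** Returns true when the barrier has arrived on every channel. */
    public boolean onBarrier(int channel) {
        if (blocked[channel]) {
            throw new IllegalStateException("second barrier on channel " + channel);
        }
        blocked[channel] = true;
        barriersSeen++;
        if (barriersSeen < blocked.length) {
            return false;
        }
        // Aligned: this is the point where the operator state would be snapshotted.
        Arrays.fill(blocked, false);
        barriersSeen = 0;
        emitted.addAll(buffered); // release the held-back records after the snapshot
        buffered.clear();
        return true;
    }

    public List<String> emitted() {
        return emitted;
    }

    public int bufferedCount() {
        return buffered.size();
    }

    public static void main(String[] args) {
        BarrierAlignmentSketch aligner = new BarrierAlignmentSketch(2);
        aligner.onRecord(0, "a");
        aligner.onBarrier(0);
        aligner.onRecord(0, "b"); // held back: belongs to the next checkpoint
        aligner.onRecord(1, "c"); // still belongs to the current checkpoint
        System.out.println(aligner.onBarrier(1));
        System.out.println(aligner.emitted());
    }
}
```

Holding back record "b" is what keeps the snapshot consistent: the state captured at alignment reflects exactly the records that preceded the barrier on every channel. A handler that only tracks barriers would skip the buffering and accept that post-barrier records may already be reflected in the state.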

Flink - Checkpoint
Abstract: Flink's most distinctive feature for stream processing is its global snapshot. CheckpointCoordinator The core component that takes the snapshot is CheckpointCoordinator /** * The checkpoint coordinator coordinates the distributed snapshots of operators and state. * I...

posted @ 2016-11-19 00:11 fxjwind Views (4896) Comments (0) Recommendations (0)

Flink - metrics
Abstract: Metrics are organized by MetricGroup. MetricGroup A MetricGroup is simply a metric container that can hold subgroups or metrics of various kinds, so its main interface is registration: /** * A MetricGroup is a named container for {@link Metric Metrics} and further metric ...

posted @ 2016-11-10 16:19 fxjwind Views (4018) Comments (1) Recommendations (0)
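
The container idea in the entry above, a named group that holds metrics and further subgroups, can be sketched as a small tree. This is a hypothetical plain-Java model, not Flink's MetricGroup interface: the class name `MetricGroupSketch`, the use of `AtomicLong` as the only metric type, and the dotted identifiers produced by `dump` are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a named metric container with subgroups; not Flink's MetricGroup.
public class MetricGroupSketch {
    private final String name;
    private final Map<String, MetricGroupSketch> subGroups = new LinkedHashMap<>();
    private final Map<String, AtomicLong> counters = new LinkedHashMap<>();

    public MetricGroupSketch(String name) {
        this.name = name;
    }

    /** Registers a subgroup, or returns the existing one with that name. */
    public MetricGroupSketch addGroup(String groupName) {
        return subGroups.computeIfAbsent(groupName, MetricGroupSketch::new);
    }

    /** Registers a counter, or returns the existing one with that name. */
    public AtomicLong counter(String counterName) {
        return counters.computeIfAbsent(counterName, k -> new AtomicLong());
    }

    /** Flattens the tree into dotted identifiers such as "root.sub.counter". */
    public Map<String, Long> dump() {
        Map<String, Long> out = new LinkedHashMap<>();
        collect("", out);
        return out;
    }

    private void collect(String prefix, Map<String, Long> out) {
        String scope = prefix.isEmpty() ? name : prefix + "." + name;
        counters.forEach((k, v) -> out.put(scope + "." + k, v.get()));
        subGroups.values().forEach(g -> g.collect(scope, out));
    }

    public static void main(String[] args) {
        MetricGroupSketch root = new MetricGroupSketch("taskmanager");
        AtomicLong recordsIn = root.addGroup("job").counter("numRecordsIn");
        recordsIn.incrementAndGet();
        recordsIn.incrementAndGet();
        System.out.println(root.dump());
    }
}
```

Because registration returns the existing entry when the name is already taken, callers can look up a group or counter by name repeatedly without creating duplicates.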

Flink - FLIP
Abstract: https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals FLIP-1 : Fine Grained Recovery from Task Failures When a task fails duri

posted @ 2016-10-13 11:59 fxjwind Views (1072) Comments (0) Recommendations (1)
