Flink (三) Flink 编程模型
Flink (三) Flink 编程模型
流式处理WordCount:
public class StreamWordCount {
public static void main(String[] args) throws Exception {
//创建一个流处理的执行环境
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
//接受socket数据流
DataStreamSource<String> textDataSteam = env.socketTextStream("localhost",7777);
//逐一读取数据,打散之后进行WordCount(逻辑计算)
SingleOutputStreamOperator<Tuple2<String, Integer>> wordCountDataStream = textDataSteam
.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
String[] tokens = s.split(" ");
for (String token : tokens) {
if (token.length() > 0) {
collector.collect(new Tuple2<String, Integer>(token, 1));
}
}
}
})
.filter(new FilterFunction<Tuple2<String, Integer>>() {
public boolean filter(Tuple2<String, Integer> stringIntegerTuple2) throws Exception {
if (stringIntegerTuple2.equals(null)) {
return false;
}
return true;
}
})
.keyBy(0)
.sum(1);
//打印输出
wordCountDataStream.print();
//执行任务
env.execute("StreamWordCountJob");
//测试需要开启端口7777
}
}
整个Flink程序一共分为5步:设定Flink执行环境、创建和加载数据集、对数据集指定转换操作逻辑、指定计算结果输出位置