Flink CEP: Examples and Basic Usage

CEP (Complex Event Processing) detects event patterns in an unbounded event stream, so that you can capture the important parts of the data.

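A Flink CEP program generally follows four steps:

1>. Create the input data stream
2>. Define the pattern (Pattern)
3>. Apply the Pattern to the event stream for detection
4>. Select the results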

3. Common contiguity modes for individual patterns:

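There are three common modes: strict contiguity (next()), relaxed contiguity (followedBy()), and non-deterministic relaxed contiguity (followedByAny()). There are also NOT variants for strict contiguity (notNext()) and relaxed contiguity (notFollowedBy()), but these two are used less often. The code below illustrates the three common modes.

Flink CEP programs need the following library dependency: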

<dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-cep_2.11</artifactId>
      <version>${flink.version}</version>
</dependency>

package org.stsffap.cep.monitoring;

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.IterativeCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MyCEPTest {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> dataStream = env.fromElements(("a"), ("c"), ("b1"), ("b2"));

        /*---------Strict contiguity: "b" must directly follow "a"----------------------*/
        Pattern strictPattern = Pattern.begin("start").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object s, Context<Object> context) {
                return s.toString().equalsIgnoreCase("a");
            }
        }).next("middle").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("b");
            }
        });

        CEP.pattern(dataStream, strictPattern).select(map -> {
            System.out.println("strictPattern:" + map.get("start").toString());
            System.out.println("strictPattern:" + map.get("middle").toString());
            return map;
        }).print();
        /*---------------------------------------------*/

        /*---------Relaxed contiguity: non-matching events between "a" and "b" are skipped----------------------*/
        Pattern relaxedPattern = Pattern.begin("start").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object s, Context<Object> context) {
                return s.toString().equalsIgnoreCase("a");
            }
        }).followedBy("middle").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("b");
            }
        });

        CEP.pattern(dataStream, relaxedPattern).select(map -> {
            System.out.println("relaxedPattern:" + map.get("start").toString());
            System.out.println("relaxedPattern:" + map.get("middle").toString());
            return map;
        }).print();
        /*---------------------------------------------*/


        /*---------Non-deterministic relaxed contiguity: later matching "b" events also count----------------------*/
        Pattern nonDeterminPattern = Pattern.begin("start").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object s, Context<Object> context) {
                return s.toString().equalsIgnoreCase("a");
            }
        }).followedByAny("middle").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("b");
            }
        });

        CEP.pattern(dataStream, nonDeterminPattern).select(map -> {
            System.out.println("nonDeterminPattern:" + map.get("start").toString());
            System.out.println("nonDeterminPattern:" + map.get("middle").toString());
            return map;
        }).print();
        /*---------------------------------------------*/

        env.execute("Flink CEP Test");
    }
}

Output:

nonDeterminPattern:[a]
nonDeterminPattern:[b1]
relaxedPattern:[a]
relaxedPattern:[b1]
nonDeterminPattern:[a]
nonDeterminPattern:[b2]
2> {start=[a], middle=[b2]}
1> {start=[a], middle=[b1]}
1> {start=[a], middle=[b1]}

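As the output shows, the strict contiguity pattern produces no result, because c sits between a and b. Relaxed contiguity outputs (a, b1), while non-deterministic relaxed contiguity outputs both (a, b1) and (a, b2). The lines prefixed with 1> and 2> are emitted by print(); the prefix is the index of the parallel subtask that printed the record.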
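To make the difference between the three modes concrete, here is a minimal plain-Java sketch that replays the same four events without Flink. It is only a simplified model of the semantics described above, not Flink's actual NFA implementation; the class name ContiguityModel, the Mode enum, and the match method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ContiguityModel {
    // Hypothetical names: these mirror next(), followedBy() and followedByAny().
    enum Mode { STRICT, RELAXED, NON_DETERMINISTIC_RELAXED }

    /** Finds "first then second" pairs in events under the given contiguity mode. */
    static List<List<String>> match(List<String> events, Predicate<String> first,
                                    Predicate<String> second, Mode mode) {
        List<List<String>> matches = new ArrayList<>();
        for (int i = 0; i < events.size(); i++) {
            if (!first.test(events.get(i))) {
                continue;
            }
            for (int j = i + 1; j < events.size(); j++) {
                boolean hit = second.test(events.get(j));
                if (hit) {
                    matches.add(Arrays.asList(events.get(i), events.get(j)));
                }
                if (mode == Mode.STRICT) {
                    break; // only the immediately following event is considered
                }
                if (mode == Mode.RELAXED && hit) {
                    break; // non-matching events are skipped, the first match ends the search
                }
                // NON_DETERMINISTIC_RELAXED: keep scanning, later matches also count
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("a", "c", "b1", "b2");
        Predicate<String> isA = s -> s.equalsIgnoreCase("a");
        Predicate<String> isB = s -> s.contains("b");
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + match(events, isA, isB, mode));
        }
        // STRICT: []
        // RELAXED: [[a, b1]]
        // NON_DETERMINISTIC_RELAXED: [[a, b1], [a, b2]]
    }
}
```

The only difference between the modes in this sketch is when the inner scan stops, which is a convenient way to remember them: strict stops after one event, relaxed stops at the first match, and non-deterministic relaxed never stops early.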

4. Combined pattern examples

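The examples above cover only simple individual patterns. The following examples show slightly more complex combined patterns.
a b+ c pattern: relaxed contiguity between a and b, strict contiguity between b and c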

        DataStream<String> dataStream = env.fromElements(("a"), ("b1"), ("d1"), ("b2"), ("d2"), ("b3"), ("c"));

        // a b+ c pattern: relaxed contiguity between a and b, strict contiguity between b and c
        Pattern pattern = Pattern.begin("start").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object s, Context<Object> context) {
                return s.toString().equalsIgnoreCase("a");
            }
        }).followedBy("middle").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("b");
            }
        }).oneOrMore().next("last").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("c");
            }
        });

        CEP.pattern(dataStream, pattern).select(map -> {
            System.out.println("pattern:" + map.get("start").toString());
            System.out.println("pattern:" + map.get("middle").toString());
            System.out.println("pattern:" + map.get("last").toString());
            return map;
        }).print();

Output:

pattern:[a]
pattern:[b1, b2, b3]
pattern:[c]
1> {start=[a], middle=[b1, b2, b3], last=[c]}

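a b+ c pattern: strict contiguity between a and b, relaxed contiguity between b and c

        DataStream<String> dataStream = env.fromElements(("a"), ("b1"), ("d1"), ("b2"), ("d2"), ("b3"), ("c"));
        // a b+ c pattern: strict contiguity between a and b, relaxed contiguity between b and c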
        Pattern pattern = Pattern.begin("start").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object s, Context<Object> context) {
                return s.toString().equalsIgnoreCase("a");
            }
        }).next("middle").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("b");
            }
        }).oneOrMore().followedBy("last").where(new IterativeCondition<Object>() {
            @Override
            public boolean filter(Object o, Context<Object> context) {
                return o.toString().contains("c");
            }
        });

        CEP.pattern(dataStream, pattern).select(map -> {
            System.out.println("--------------------------------------");
            System.out.println("pattern:" + map.get("start").toString());
            System.out.println("pattern:" + map.get("middle").toString());
            System.out.println("pattern:" + map.get("last").toString());
            return map;
        }).print();

Output:

--------------------------------------
pattern:[a]
pattern:[b1, b2, b3]
pattern:[c]
--------------------------------------
pattern:[a]
pattern:[b1, b2]
pattern:[c]
--------------------------------------
pattern:[a]
pattern:[b1]
pattern:[c]
3> {start=[a], middle=[b1], last=[c]}
1> {start=[a], middle=[b1, b2, b3], last=[c]}
2> {start=[a], middle=[b1, b2], last=[c]}
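The two results can be reproduced with a small plain-Java model of "a b+ c" that runs without Flink. This is only a sketch under simplifying assumptions: the b+ loop skips non-b events between b's (relaxed contiguity inside the loop), and a strict transition looks only at the immediately following event. It is not Flink's NFA, and the names LoopPatternModel, match, and next are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LoopPatternModel {
    /**
     * Enumerates matches of "a b+ c". strictAB and strictBC select strict (next)
     * or relaxed (followedBy) contiguity for the a-to-b and b-to-c transitions.
     */
    static List<List<String>> match(List<String> events, boolean strictAB, boolean strictBC) {
        List<List<String>> matches = new ArrayList<>();
        for (int i = 0; i < events.size(); i++) {
            if (!events.get(i).equalsIgnoreCase("a")) {
                continue;
            }
            int firstB = next(events, i + 1, "b", strictAB);
            List<String> bs = new ArrayList<>();
            // Grow the b+ group one b at a time and try to close with c after each step.
            for (int b = firstB; b >= 0; b = next(events, b + 1, "b", false)) {
                bs.add(events.get(b));
                int c = next(events, b + 1, "c", strictBC);
                if (c >= 0) {
                    List<String> m = new ArrayList<>();
                    m.add(events.get(i));
                    m.addAll(bs);
                    m.add(events.get(c));
                    matches.add(m);
                }
            }
        }
        return matches;
    }

    /** Index of the next event containing key, or -1; strict inspects only position 'from'. */
    static int next(List<String> events, int from, String key, boolean strict) {
        for (int j = from; j < events.size(); j++) {
            if (events.get(j).contains(key)) {
                return j;
            }
            if (strict) {
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("a", "b1", "d1", "b2", "d2", "b3", "c");
        // relaxed a-to-b, strict b-to-c: only the group ending right before c survives
        System.out.println(match(events, false, true));
        // prints [[a, b1, b2, b3, c]]
        // strict a-to-b, relaxed b-to-c: every prefix of the b group can reach c
        System.out.println(match(events, true, false));
        // prints [[a, b1, c], [a, b1, b2, c], [a, b1, b2, b3, c]]
    }
}
```

In this model the strict b-to-c transition is what prunes [b1] and [b1, b2] in the first example: each of them is followed by a d event, so only the group ending in b3 is directly followed by c.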


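Flink CEP in real-time stream processing goes well beyond the simple cases introduced above, and supports many more complex applications. For details, see the official Flink documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/dev/libs/cep.html).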

posted @ 2020-12-01 14:23  技术即艺术