Storm Trident示例ReducerAggregator

ReducerAggregator首先在输入流上运行全局重新分区操作(global)将同一批次的所有分区合并到一个分区中,然后在每个批次上运行的聚合功能,针对Batch操作。

 

省略部分代码,省略部分可参考:https://blog.csdn.net/nickta/article/details/79666918

FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,    
                new Values("nickt1", 4),   
                new Values("nickt2", 7),    
                new Values("nickt3", 8),   
                new Values("nickt4", 9),    
                new Values("nickt5", 7),   
                new Values("nickt6", 11),   
                new Values("nickt7", 5)   
                );   
        spout.setCycle(false);   
        TridentTopology topology = new TridentTopology();   
        topology.newStream("spout1", spout)   
                .shuffle()   
                .each(new Fields("user", "score"),new Debug("shuffle print:"))  
                .parallelismHint(5)  
                .aggregate(new Fields("score"), new ReducerAggregator<Integer>() {  
                    @Override  
                    //每一个batch,初始调用 1次,空的batch也会调用1次  
                    public Integer init() {  
                        return 0;  
                    }  
                    @Override  
                    //batch看的每个tuple调用1次  
                    public Integer reduce(Integer curr, TridentTuple tuple) {  
                        return curr + tuple.getIntegerByField("score");  
                    }  
                }, new Fields("sum"))//对每个batch中的score求和  
                .each(new Fields("sum"),new Debug("sum print:"))  
                .parallelismHint(5);  

 

输出:

[partition4-Thread-152-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt3, 8]
[partition0-Thread-68-b-0-executor[33 33]]> DEBUG(shuffle print:): [nickt1, 4]
[partition2-Thread-64-b-0-executor[35 35]]> DEBUG(shuffle print:): [nickt2, 7]
[partition1-Thread-80-b-1-executor[39 39]]> DEBUG(sum print:): [19]
[partition4-Thread-152-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt4, 9]
[partition2-Thread-64-b-0-executor[35 35]]> DEBUG(shuffle print:): [nickt5, 7]
[partition2-Thread-64-b-0-executor[35 35]]> DEBUG(shuffle print:): [nickt6, 11]
[partition2-Thread-66-b-1-executor[40 40]]> DEBUG(sum print:): [27]
[partition0-Thread-68-b-0-executor[33 33]]> DEBUG(shuffle print:): [nickt7, 5]
[partition3-Thread-56-b-1-executor[41 41]]> DEBUG(sum print:): [5]

posted @ 2018-03-24 20:32  nickt  阅读(120)  评论(0编辑  收藏  举报