Storm API (Part 2)
1. Spout
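The top-level abstraction for a spout is the ISpout interface.
open() is the initialization method. It gives the spout its execution environment, and the executor calls it to initialize the spout.
nextTuple() is called in a loop and emits the generated data through the collector.
ack() is the callback for a tuple that was processed successfully; it confirms that the specific tuple is done.
fail() is the callback for a tuple that failed processing (or timed out); the spout decides here whether to re-emit the tuple or drop it.
activate() and deactivate(): a spout can be temporarily activated and deactivated.
close() is called before the spout shuts down, but the call is not guaranteed. It does not run when the worker process is killed with kill -9. It is expected to run when the topology is stopped with storm kill topoName, although the ISpout Javadoc only guarantees the call when a topology is killed in local mode. Example: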
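// Imports assume Storm 1.x or later (org.apache.storm.*) and Apache Commons IO.
import java.io.File;
import java.io.IOException;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.io.FileUtils;
import org.apache.commons.io.filefilter.FileFilterUtils;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.IRichSpout;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class BaseSpout implements IRichSpout {

    private String inputPath;
    private SpoutOutputCollector collector;
    // msgId -> message content, kept so that failed tuples can be re-emitted.
    // Nothing in this class adds entries; they must be cached wherever a tuple
    // is first emitted with a msgId.
    private HashMap<String, String> waitAck = new HashMap<String, String>();

    // conf: the Storm configuration for this spout
    // topologyContext: complete information about the spout's place in the
    //   topology, its task id, and its inputs and outputs
    // spoutOutputCollector: lets us emit the tuples that the bolts will process
    @Override
    public void open(Map conf, TopologyContext topologyContext, SpoutOutputCollector spoutOutputCollector) {
        this.collector = spoutOutputCollector;
        inputPath = (String) conf.get("INPUT_PATH");
    }

    @Override
    public void close() {
    }

    @Override
    public void activate() {
    }

    @Override
    public void deactivate() {
    }

    // Called periodically from the same loop as ack() and fail(). It must give
    // up control of the thread when there is no work to do, so that the other
    // methods get a chance to run. nextTuple() therefore first checks whether
    // there is anything left to process; if not, it sleeps for at least one
    // millisecond before returning, to reduce processor load.
    @Override
    public void nextTuple() {
        Collection<File> files = FileUtils.listFiles(new File(inputPath), FileFilterUtils.suffixFileFilter(".xml"), null);
        if (files.isEmpty()) {
            Utils.sleep(1);
            return;
        }
        //String uuid = UUID.randomUUID().toString().replace("-", "");
        for (File file : files) {
            // BaseXMLHandle is the author's helper class (not shown here); it
            // parses the file and emits tuples through the collector.
            BaseXMLHandle.handleXml(file, collector);
            try {
                FileUtils.forceDelete(file);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    @Override
    public void ack(Object msgId) {
        System.out.println("Message sent successfully!");
        System.out.println("Message processed successfully!");
        waitAck.remove(msgId);
    }

    @Override
    public void fail(Object msgId) {
        // Re-emit the cached content under the same msgId, if it was cached.
        String line = waitAck.get(msgId);
        if (line != null) {
            collector.emit(new Values(line), msgId);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
        outputFieldsDeclarer.declare(new Fields("line"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}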
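The comment on nextTuple() above describes a "back off when idle" pattern. As a rough illustration, the sketch below models it in plain Java with no Storm classes: `IdleBackoffSource`, its queue, and its counters are hypothetical stand-ins, and `Thread.sleep(1)` plays the role that a short sleep would play in a real spout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a spout; not a Storm class.
class IdleBackoffSource {
    private final Queue<String> pending = new ArrayDeque<>();
    final List<String> emitted = new ArrayList<>();
    int idleCalls = 0;

    void offer(String item) {
        pending.add(item);
    }

    // Same shape as nextTuple(): emit one item if available, otherwise
    // sleep briefly and return so the calling loop does not busy-spin.
    void nextTuple() {
        String item = pending.poll();
        if (item == null) {
            idleCalls++;
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag
            }
            return;
        }
        emitted.add(item);
    }

    public static void main(String[] args) {
        IdleBackoffSource source = new IdleBackoffSource();
        source.offer("a.xml");
        source.offer("b.xml");
        // The executor would call nextTuple() in a loop; five calls here.
        for (int i = 0; i < 5; i++) {
            source.nextTuple();
        }
        System.out.println(source.emitted + " idle=" + source.idleCalls);
        // prints: [a.xml, b.xml] idle=3
    }
}
```

The method returns immediately in both branches; it never waits for new input to arrive, which keeps it non-blocking from the caller's point of view.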
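Here, fail() handles a tuple whose processing failed by emitting it again. The ack mechanism tracks whether a tuple, and every tuple derived from it, was processed successfully. The msgId passed to ack() and fail() is only the ID of the emitted message; it does not carry the message content, so the spout has to cache the content itself.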
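Because the callbacks receive only the msgId, the spout needs its own msgId-to-content map. The following is a minimal sketch of that bookkeeping in plain Java, under the assumption that the cache is filled at emit time. `ReplayCache` and its `emitted` list are hypothetical; the list merely stands in for calls to `SpoutOutputCollector.emit(values, msgId)`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Plain-Java model of spout-side replay bookkeeping; no Storm classes are used.
class ReplayCache {
    // msgId -> content of every tuple that was emitted but not yet acked
    private final Map<String, String> waitAck = new HashMap<>();
    // Stand-in for the collector: records every emission, including replays
    final List<String> emitted = new ArrayList<>();

    // Cache the content before emitting, so fail() can find it later.
    String emit(String line) {
        String msgId = UUID.randomUUID().toString();
        waitAck.put(msgId, line);
        emitted.add(line);
        return msgId;
    }

    // Fully processed: the cached copy is no longer needed.
    void ack(String msgId) {
        waitAck.remove(msgId);
    }

    // Failed: re-emit the cached content under the same msgId.
    void fail(String msgId) {
        String line = waitAck.get(msgId);
        if (line != null) {
            emitted.add(line);
        }
    }

    int pendingCount() {
        return waitAck.size();
    }

    public static void main(String[] args) {
        ReplayCache cache = new ReplayCache();
        String first = cache.emit("<order id=\"1\"/>");
        String second = cache.emit("<order id=\"2\"/>");
        cache.ack(first);   // first tuple is done and dropped from the cache
        cache.fail(second); // second tuple is replayed and stays cached
        System.out.println("emissions=" + cache.emitted.size()
                + " pending=" + cache.pendingCount());
        // prints: emissions=3 pending=1
    }
}
```

A replayed tuple stays in the map until it is eventually acked, so a real spout may also want a retry limit to stop a permanently failing message from looping forever.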
2. Bolt
A bolt that implements IRichBolt exposes the following API:
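// Imports assume Storm 1.x or later package names (org.apache.storm.*).
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.IRichBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Tuple;

public class BaseBolt implements IRichBolt {

    // Called once before any tuple is processed; keep the collector here
    // if execute() needs to emit, ack, or fail tuples.
    @Override
    public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
    }

    // Called for every incoming tuple; with IRichBolt the bolt itself must
    // ack or fail the tuple through the OutputCollector.
    @Override
    public void execute(Tuple tuple) {
    }

    @Override
    public void cleanup() {
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}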
A bolt that extends BaseBasicBolt exposes the API below. It does not need to ack each tuple explicitly, because Storm does that internally:
public class BaseBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple tuple, BasicOutputCollector basicOutputCollector) {

    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {

    }
}
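To make "Storm handles acking internally" more concrete, here is a simplified plain-Java model of the wrapper behaviour: run the user's execute logic, ack when it returns normally, fail when it throws. `AutoAckModel` and its lists are hypothetical and use no Storm classes. In Storm itself the tuple is failed when execute() throws FailedException; this sketch treats any RuntimeException the same way for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified model of the automatic ack/fail around BaseBasicBolt.execute();
// this is an illustration, not Storm code.
class AutoAckModel {
    final List<String> acked = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    // Runs the user's logic, then acks on normal return or fails on exception.
    void run(String tuple, Consumer<String> execute) {
        try {
            execute.accept(tuple);
            acked.add(tuple);
        } catch (RuntimeException e) {
            failed.add(tuple);
        }
    }

    public static void main(String[] args) {
        AutoAckModel model = new AutoAckModel();
        model.run("good", t -> { });
        model.run("bad", t -> {
            throw new IllegalStateException("cannot parse " + t);
        });
        System.out.println("acked=" + model.acked + " failed=" + model.failed);
        // prints: acked=[good] failed=[bad]
    }
}
```

With IRichBolt the same decisions have to be written by hand in execute(), which is why BaseBasicBolt is the more convenient base class when every tuple is simply acked after processing.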
3. Submitting the Topology from the main class
TopologyBuilder topoBuilder = new TopologyBuilder();
// Register the spout under component id "1"
topoBuilder.setSpout("1", new BaseSpout());
//topoBuilder.setBolt("2", new BaseBolt()).shuffleGrouping("1");
// HDFSBolt subscribes to the spout's output with shuffle grouping
topoBuilder.setBolt("3", new HDFSBolt()).shuffleGrouping("1");
Config config = new Config();
// Read back in BaseSpout.open() via conf.get("INPUT_PATH")
config.put("INPUT_PATH", "C:\\Users\\hst\\Desktop\\stormdemo");
// Run the topology in-process on a local cluster
LocalCluster localCluster = new LocalCluster();
localCluster.submitTopology("mytopo", config, topoBuilder.createTopology());
System.out.println("Hello World!");