oozie fork join结点

oozie可以用fork和join节点进行多任务并行处理,同时fork和join也是同时出现,缺一不可.

语法:

<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.1">
    ...
    <fork name="[FORK-NODE-NAME]">
        <path start="[NODE-NAME]" />
        ...
        <path start="[NODE-NAME]" />
    </fork>
    ...
    <join name="[JOIN-NODE-NAME]" to="[NODE-NAME]" />
    ...
</workflow-app>

官网给出的例子:

<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <fork name="forking">
        <path start="firstparalleljob"/>
        <path start="secondparalleljob"/>
    </fork>
    <action name="firstparallejob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <job-xml>job1.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <action name="secondparalleljob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <job-xml>job2.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <join name="joining" to="nextaction"/>
    ...
</workflow-app>

工作时写的:

<workflow-app  name="java-example1" xmlns="uri:oozie:workflow:0.5">  
    <start to="forking"/> 
    <fork name="forking">
        <path start="firstparalleljob"/>
        <path start="secondparalleljob"/>
    </fork>    
    <action name="firstparalleljob">
       <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>mapred.job.queue.name</name>
                  <value>${queueName}</value>
                </property>
            </configuration>
            <exec>java</exec>
            <argument>-cp</argument>
            <argument>test1.OzzieTest1</argument>
            <argument>-jar</argument>
            <argument>test.jar</argument>
        </shell>
        <ok to="joining"/>
        <error to="fail"/>    
    </action> 
    <action name="secondparalleljob">
      <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>mapred.job.queue.name</name>
                  <value>${queueName}</value>
                </property>
            </configuration>
            <exec>java</exec>
            <argument>-cp</argument>
            <argument>test1.OzzieTest</argument>
            <argument>-jar</argument>
            <argument>test.jar</argument>
        </shell>
        <ok to="joining"/>
        <error to="fail"/>    
    </action>   
    <join name="joining" to="end"/>
      <kill name="fail">  
       <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>  
    </kill>  
   <end name="end"/>  
</workflow-app> 

fork节点把任务切分成多个并行任务,join则合并多个并行任务。fork和join节点必须是成对出现的。join节点合并的任务,必须是通一个fork出来的子任务才行。

 

posted on 2017-09-04 15:51  风景1573  阅读(2162)  评论(0编辑  收藏  举报