DataStage tips 2: Scheduling DataStage jobs with crontab + shell

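  When extracting data with DataStage, you often need near-real-time extraction (for example, syncing data every 3 minutes). crontab plus a shell script can meet this requirement. As the figure below shows, the overall flow of the shell script that schedules a DS job is: first, use dsjob with the -jobinfo option to get the job's process ID (process_id) and status (state). If the state is abnormal, reset the job; if the state is normal, check whether the job is running (that is, whether process_id is 0). If process_id = 0, the job is not currently running, so start it as usual; if process_id > 0, the job is still running, so skip straight to the end.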

The code is as follows:

#!/bin/sh
# Load the dsadm environment (sets DSHOME, among others).
. /home/dsadm/.bash_profile

APT_CONFIG_FILE=/opt/IBM/InformationServer/Server/Configurations/default.apt
PROJECT_NAME=$1
JOB_NAME=$2

# Job state flags
status=1
job_process_id=1

cd "$DSHOME/bin" || exit 1

# Query the job once and reuse the output, so both values come from the same snapshot.
job_info=`./dsjob -jobinfo "$PROJECT_NAME" "$JOB_NAME"`
echo "$job_info"

# Line 10 of the output holds the job process ID; keep only the number after the colon.
job_process_id=`echo "$job_info" | sed -n '10p' | sed 's/.*:[ ]\{1,\}\([0-9]\{1,\}\)/\1/'`
echo "job_process_id=$job_process_id"

# Line 1 holds the job status; status is 1 if it contains FAILED, otherwise 0.
status=`echo "$job_info" | sed -n '1p' | grep -c FAILED`
echo "status=$status"

if [ "$status" -ne 0 ]; then
    # Abnormal state: reset the job.
    ./dsjob -run -mode RESET "$PROJECT_NAME" "$JOB_NAME"
    echo "Invoke: dsjob -run -mode RESET $PROJECT_NAME $JOB_NAME"
else
    # Normal state: start the job only if it is not already running.
    if [ "$job_process_id" -eq 0 ]; then
        # The job parameter is literally named $APT_CONFIG_FILE, hence the escaped dollar sign.
        ./dsjob -run -param \$APT_CONFIG_FILE="$APT_CONFIG_FILE" "$PROJECT_NAME" "$JOB_NAME"
    fi
fi

echo "Finished"
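
The script above picks its fields by line number (lines 1 and 10 of the -jobinfo output), which breaks if the layout ever shifts. As a hedged alternative, the sketch below makes the same reset / run / skip decision by matching field labels instead. The labels "Job Status" and "Job Process ID", the helper names get_field and decide_action, and the sample text are all assumptions for illustration; check them against the output of your own installation before relying on them.

```shell
#!/bin/sh
# Sketch: decide what to do from `dsjob -jobinfo` text, matching fields by label.
# The labels and the sample text below are assumptions, not verified output.

# Print the value after the colon for the first line whose label is $2.
get_field() {
    printf '%s\n' "$1" \
        | sed -n "s/^$2[[:space:]]*:[[:space:]]*//p" \
        | sed -e 's/[[:space:]]*$//' -e 1q
}

# Print "reset", "run", or "skip" for the given jobinfo text.
decide_action() {
    job_status=$(get_field "$1" "Job Status")
    job_pid=$(get_field "$1" "Job Process ID")
    case "$job_status" in
        *FAILED*) echo reset; return ;;
    esac
    if [ "$job_pid" = "0" ]; then
        echo run
    else
        # Non-zero or missing process ID: treat the job as running and do nothing.
        echo skip
    fi
}

# Hypothetical sample text, for illustration only.
sample_info=$(printf 'Job Status\t: RUN OK (1)\nJob Process ID\t: 0\n')
decide_action "$sample_info"   # prints: run
```

In a real script, decide_action would be fed the captured output of ./dsjob -jobinfo, and its result would choose between the RESET call, the normal run, and doing nothing. Treating a missing process ID as "skip" is a deliberately conservative choice, so that a parsing failure never starts a second copy of the job.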

Then schedule the script above with crontab, and you are done.
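
For the 3-minute interval mentioned at the start, the crontab entry might look like the one built below. This is only a sketch: the script path, project name, job name, and log file are hypothetical placeholders, and the `*/3` step syntax assumes a cron implementation that supports it (common on Linux; some older Unix crons need the minutes listed explicitly).

```shell
#!/bin/sh
# Build a crontab entry that runs the scheduling script every 3 minutes.
# SCRIPT_PATH, MyProject, MyJob, and the log file are hypothetical placeholders.
SCRIPT_PATH=/home/dsadm/scripts/run_ds_job.sh
CRON_ENTRY="*/3 * * * * $SCRIPT_PATH MyProject MyJob >> /tmp/run_ds_job.log 2>&1"

# Print the entry; paste it into the editor opened by `crontab -e` as the dsadm user.
printf '%s\n' "$CRON_ENTRY"
```

Because the script starts the job only when its process ID is 0, a run that lasts longer than 3 minutes is simply skipped on the next tick rather than started twice.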

posted on 2013-01-30 17:17  gobird
