DataStage Tips 2: Scheduling DataStage Jobs on a Timer with crontab + shell
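When extracting data with DataStage, a common requirement is near-real-time extraction (for example, synchronizing data every 3 minutes). This can be implemented with crontab plus a shell script. The figure below shows the overall flow by which the shell script schedules a DS job. First, the script uses the -jobinfo option of dsjob to obtain the job's process ID (process_id) and state. If the state is abnormal, the job is reset. If the state is normal, the script checks whether the job is running (that is, whether process_id is 0). If process_id = 0, the job is not currently running and is started normally; if process_id > 0, the job is still running, so the script skips straight to the end.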
The code is as follows:
#!/bin/sh
# Load the dsadm environment (expected to define DSHOME)
. /home/dsadm/.bash_profile
APT_CONFIG_FILE=/opt/IBM/InformationServer/Server/Configurations/default.apt
PROJECT_NAME=$1
JOB_NAME=$2
# job state flags
status=1
job_process_id=1
cd "$DSHOME/bin"
# check job status: take the number after the colon on line 10 of the jobinfo output
job_process_id=`./dsjob -jobinfo "$PROJECT_NAME" "$JOB_NAME"|sed -n '10p'|sed 's/\(.*\)\([:]\)\([ ]\{1,\}\)\([0-9]\{1,\}\)/\4/g'`
./dsjob -jobinfo "$PROJECT_NAME" "$JOB_NAME"
echo "job_process_id=$job_process_id"
# status is non-zero when line 1 of the jobinfo output contains FAILED
status=`./dsjob -jobinfo "$PROJECT_NAME" "$JOB_NAME"|sed -n '1p'|sed 's/\(.*\)\([:]\)\([ ]\{1,\}\)\(.*\)/\4/g'|grep FAILED|wc -l`
echo "status=$status"
if [ $status -ne 0 ]; then
    # abnormal state: reset the job
    ./dsjob -run -mode RESET "$PROJECT_NAME" "$JOB_NAME"
    echo "Invoke: dsjob -run -mode RESET $PROJECT_NAME $JOB_NAME"
else
    # normal state: start the job only if it is not already running
    if [ "$job_process_id" -eq 0 ]; then
        ./dsjob -run -param \$APT_CONFIG_FILE="$APT_CONFIG_FILE" "$PROJECT_NAME" "$JOB_NAME"
    fi
fi
echo "Finished"
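The script above picks fields out of the jobinfo output by line number (line 1 for the state, line 10 for the process ID), which breaks if the layout shifts between versions. As a possible alternative, here is a minimal sketch that looks a field up by its label instead. The helper name jobinfo_field is hypothetical, and the labels "Job Status" and "Job Process ID" are assumptions about the "label : value" layout that the sed expressions above imply; check them against the actual output of your installation.

```shell
#!/bin/sh
# jobinfo_field LABEL
# Hypothetical helper: reads "label : value" lines on stdin and prints the
# value of the first line whose label equals LABEL (surrounding blanks ignored).
jobinfo_field() {
    awk -F':' -v label="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)      # trim blanks around the label
            if (key == label) {
                # keep everything after the FIRST colon, so values that
                # themselves contain colons (e.g. timestamps) stay intact
                value = substr($0, index($0, ":") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit
            }
        }'
}
```

With this helper, the process ID could be read as job_process_id=`./dsjob -jobinfo "$PROJECT_NAME" "$JOB_NAME" | jobinfo_field "Job Process ID"`, assuming that label matches what dsjob prints.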
Finally, schedule the script above with crontab.
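For the 3-minute interval mentioned at the start, the crontab entry might look like the line produced by the sketch below. The helper name cron_entry, the script path, the project and job names, and the log file are all hypothetical placeholders. The */N step syntax is also an assumption: it is accepted by Vixie-style cron (common on Linux), while some older Unix cron implementations require the minutes to be listed explicitly (0,3,6,...).

```shell
#!/bin/sh
# cron_entry MINUTES SCRIPT PROJECT JOB LOGFILE
# Hypothetical helper: prints a crontab line that runs SCRIPT every MINUTES
# minutes with PROJECT and JOB as arguments, appending all output to LOGFILE.
cron_entry() {
    printf '*/%s * * * * %s %s %s >> %s 2>&1\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with placeholder values: only prints the line, does not install it.
cron_entry 3 /home/dsadm/scripts/run_ds_job.sh MYPROJECT MYJOB /tmp/run_ds_job.log
```

To install the entry for the current user while keeping existing entries, one common idiom is ( crontab -l 2>/dev/null; cron_entry 3 ... ) | crontab -. Because the script itself skips the run when the job's process ID is non-zero, a run that takes longer than the interval should not be started a second time.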