Managing Programs with Timeouts

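A new scenario came up:

Data has to be copied from two online TTs to two offline TTs. With the earlier approach the two copy processes run serially, which does not meet the business requirement. How can this be handled?

My first idea was to submit both commands at once as cmd1;cmd2, but that still runs them one after the other. So what is the fix?

The most direct way is to run both commands in the background (cmd1 & cmd2 &), monitor how long each process has been running, and kill any process that exceeds the time limit. That raises the question of how to get each process's pid reliably.

After a good deal of investigation, this is the solution I settled on: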

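import time
import psutil

def waitPid(name, timeOut):
    # name: list of the argument strings used when the process was started
    for proc in psutil.process_iter():  # iterate over the current processes
        try:
            args = proc.cmdline()  # the process's command line, as a list
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue  # the process exited or cannot be inspected
        if all(key in args for key in name):  # every argument must be present
            print("pid: %s" % proc.pid)
            time.sleep(timeOut)
            if proc.is_running():  # check the process still exists, so the kill does not raise
                print("killing pid %s" % proc.pid)
                proc.kill()  # sends SIGKILL, the equivalent of kill -9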
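One detail of the matching in waitPid is easy to miss: each entry of name has to equal a whole command-line argument, because `in` on a list tests element equality and not substrings. A minimal sketch of just that check, with a hypothetical helper name (cmdline_matches) and a made-up command line shaped like what cmdline() returns:

```python
def cmdline_matches(cmdline, keys):
    """Return True when every key equals some element of cmdline."""
    return all(key in cmdline for key in keys)

# Hypothetical command line, for illustration only.
example = ["java", "-cp", "app.jar", "demo.Main", "member_cart", "10000"]

print(cmdline_matches(example, ["demo.Main", "member_cart"]))  # True
print(cmdline_matches(example, ["demo.Main", "member"]))       # False: "member" is only a substring
```

So a partial job name will not match by accident, but every entry passed in has to be spelled exactly as it appears on the command line.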

Usage:

import os

namelist=["com.aliyun.timetunnel.demo.TTReadAndWrite", "member_cart", "0324153555NBJ3YZX0", "e62375c5-6bcb-4bde-900b-0a38c2f6b218", "1490842758", "10000", "member_cart_blink_mufeng", "5fa49387-e8da-461f-a835-abf03f9b9d4c"]

# First kill any leftover process from the same task; otherwise the newly
# submitted process may stay in the sleep state and never run.
waitPid(namelist,1)

cmd = "java -cp utils/blink_test-1.0-shaded.jar com.aliyun.timetunnel.demo.TTReadAndWrite member_cart 0324153555NBJ3YZX0 e62375c5-6bcb-4bde-900b-0a38c2f6b218 1490842758 10000 member_cart_blink_mufeng 5fa49387-e8da-461f-a835-abf03f9b9d4c >aa.log 2>&1 &"

os.system(cmd)  # the trailing & puts the job in the background

# timeOut is the time limit in seconds, set by the caller.
waitPid(namelist,timeOut)
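When the script starts the processes itself, there is another option that avoids matching command lines: subprocess.Popen returns immediately and keeps each child's pid, so several commands can run in parallel under one shared deadline. The following is only a sketch under that assumption. The function name run_all_with_deadline and the two stand-in commands are hypothetical, not part of the original job.

```python
import subprocess
import sys
import time

def run_all_with_deadline(commands, time_out, poll_interval=0.1):
    """Start every command at once; kill whatever is still running at the deadline.

    Returns one exit code per command, in the same order.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]  # all children start in parallel
    deadline = time.time() + time_out
    while time.time() < deadline and any(p.poll() is None for p in procs):
        time.sleep(poll_interval)
    for p in procs:
        if p.poll() is None:  # still running after the deadline
            p.kill()
        p.wait()  # reap the child so no zombie is left behind
    return [p.returncode for p in procs]

# Stand-ins for the two copy jobs: one finishes at once, one would run for 30 s.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_all_with_deadline([quick, slow], time_out=2))
```

The first exit code is 0 and the second is non-zero, since the slow child is killed at the deadline. The exact value for a killed child depends on the platform.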

 

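This situation comes up often at work:

  For example, a data import hangs because the upstream has no data, or a command hangs while it is executing. In such cases a timeout check is needed so that the program exits once a given time limit is exceeded. How should that be written?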

There are two ways to write it, one in shell and one in Python.

Shell version:

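function timeout()
{
    command=$*
    #echo "command:" $command
    waitfor=300    # time limit in seconds

    # run the command in the background and remember its pid
    $command &
    commandpid=$!

    # watchdog: kill the command if it is still running after $waitfor seconds
    ( sleep $waitfor ; kill -9 $commandpid > /dev/null 2>&1 ) &
    watchdog=$!

    # block until the command exits; report success only on exit status 0
    wait $commandpid > /dev/null 2>&1 && echo "success"

    # the command is done, so the watchdog is no longer needed
    kill $watchdog > /dev/null 2>&1
}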

Python version:

 

import time
from subprocess import Popen, PIPE

def sys_command_outstatuserr(cmd, timeout=120):
    # Run cmd through the shell and return (stdout, exit code, stderr).
    # Note: output is only read after the command exits, so a command that
    # fills the pipe buffer will block and end up being reported as a timeout.
    #print "cmd:%s"%cmd
    p = Popen(cmd, stdout=PIPE, stderr=PIPE, shell=True)
    t_beginning = time.time()
    print("t_beginning:%s" % t_beginning)
    while True:
        if p.poll() is not None:  # the command has finished
            res = p.communicate()
            return res[0], p.returncode, res[1]
        seconds_passed = time.time() - t_beginning
        print("seconds_passed:%s" % seconds_passed)
        if timeout and seconds_passed > timeout:
            p.terminate()  # sends SIGTERM to the shell that runs cmd
            p.wait()  # reap the terminated process
            return '', 128, 'Timeout'
        time.sleep(10)  # poll every 10 seconds
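On Python 3.5 or later, the same idea can be written without a polling loop, because subprocess.run accepts a timeout argument and kills the child itself when the limit is reached. This is a sketch of that alternative, not the code used above. The name run_with_timeout is hypothetical, the command is passed as an argument list rather than a shell string, and the 128 / 'Timeout' return convention is copied from the function above.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout=120):
    """Run cmd (a list of arguments); return (stdout, exit code, stderr)."""
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                              universal_newlines=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # run() has already killed and reaped the child at this point
        return "", 128, "Timeout"
    return done.stdout, done.returncode, done.stderr

out, code, err = run_with_timeout([sys.executable, "-c", "print('hello')"], timeout=30)
print(out.strip(), code)  # hello 0

# A command that outlives its limit is reported with the timeout convention.
print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout=1))
# ('', 128, 'Timeout')
```

Because run() reads the pipes while it waits, a command that produces a lot of output is not mistaken for a hung one, and the timeout takes effect without the 10-second polling delay.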

 

posted @ 2017-03-28 11:00  红诺