
Batch-capturing Cacti traffic graphs with Selenium in Python 3

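  Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, Opera, and others. Anyone learning the basics of web scraping with Python will run into the Selenium framework sooner or later.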

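  1. First, of course, you need the Selenium module. This assumes you already have Python 3 installed, along with a Python editor such as PyCharm, IDLE, or Visual Studio Code. There are many other editors; see this link for details: https://baijiahao.baidu.com/s?id=1620388483830154843&wfr=spider&for=pc. I use PyCharm.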

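  2. Because Selenium runs in a real browser, you first need to download the driver that matches your browser. Most people use Chrome, Firefox, or IE; the corresponding drivers are listed here: https://www.cnblogs.com/momolei/p/10118526.html. Note: each browser version requires a matching version of the driver executable (xxx.exe), and this really matters. Put the downloaded xxx.exe in your Python 3 directory.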
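  One quick way to confirm that the driver ended up somewhere Python can find it is to look it up on PATH. The following is a minimal sketch, assuming your Python 3 directory is itself on PATH; the helper name `find_drivers` is made up for illustration and is not part of Selenium. It only checks that the file can be found, not that its version matches your browser.

```python
import shutil


def find_drivers(names=("chromedriver", "geckodriver", "IEDriverServer")):
    """Return {name: full_path} for every driver executable found on PATH.

    On Windows, shutil.which also tries the extensions listed in PATHEXT,
    so "chromedriver" matches chromedriver.exe.
    """
    found = {}
    for name in names:
        path = shutil.which(name)
        if path:
            found[name] = path
    return found


drivers = find_drivers()
if drivers:
    for name, path in drivers.items():
        print("%s -> %s" % (name, path))
else:
    print("No browser driver found on PATH")
```

  If nothing is found, you can still pass the full path to the driver explicitly, as the examples in this post do.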

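  Once the steps above are done, install Selenium from a CMD window: pip install selenium. Note that the code in this post uses the Selenium 3 API that was current when it was written. Recent Selenium 4 releases have removed the find_element_by_* helpers and the positional driver-path argument, so on a newer version the calls need adapting (for example, find_element(By.NAME, ...)).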

  3. Next, check from your Python editor that Selenium can actually open the browser. For example:

from selenium import webdriver

# Point Selenium at chromedriver (raw string, so the backslashes are taken literally)
browser = webdriver.Chrome(r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe")
# Set the page-load timeout, in seconds
browser.set_page_load_timeout(10)
# Open the Baidu home page
browser.get("https://www.baidu.com")
print(browser.page_source)

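  If Baidu opens and the page source is printed, you are 50% of the way there. With the Selenium module installed and working, you can move on to capturing the Cacti traffic graphs.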

  4. Now for the main code.

  

from selenium import webdriver
import datetime


driver = webdriver.Chrome(r'D:\python3.7\chromedriver.exe')
# Replace with the address of your Cacti server, e.g. http://xxx/graph_view.php
driver.get('http://xxx/graph_view.php')
name = driver.find_element_by_name("login_username")
passwd = driver.find_element_by_name("login_password")
name.send_keys('your-username')  # login account
passwd.send_keys('your-password')  # login password
# The button is matched by its label; "登录" is the label on a Chinese-language Cacti UI
submit = driver.find_element_by_xpath('//td/input[@value="登录"]')
submit.click()
for i in range(6,-1,-1):
    if i == 6:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days = i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        list1= [format_otherStyleTime1,format_otherStyleTime2,format_otherStyleTime3,format_otherStyleTime4]
    elif i==5:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        list2=[format_otherStyleTime1,format_otherStyleTime2,format_otherStyleTime3,format_otherStyleTime4]
    elif i==4:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        list3 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
    elif i == 3:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        list4 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
    elif i == 2:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        list5 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
    elif i == 1:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        list6 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
    elif i == 0:
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
        otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
        format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
        threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i-1))
        otherStyleTime1 = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
        format_otherStyleTime5 = "%s 06:00:00" % otherStyleTime1.split()[0]

        list7 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4,format_otherStyleTime5]
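        # Named time_points rather than list, to avoid shadowing the built-in
        time_points = list1+list2+list3+list4+list5+list6+list7
    else:
        break
# Each pair of consecutive time points is one graph window,
# so stop one short of the end to keep time_points[i+1] in range
for i in range(len(time_points) - 1):
    driver.find_element_by_name("date1").clear()  # clear the field before typing
    driver.find_element_by_name("date2").clear()
    driver.find_element_by_name("date1").send_keys(time_points[i])
    driver.find_element_by_name("date2").send_keys(time_points[i+1])
    driver.find_element_by_name("button_refresh_x").click()
    # Click the graph image, then capture the page
    driver.find_element_by_xpath(".//tbody/tr[4]/td//table/tbody/tr/td[2]/a/img").click()
    picture_name = '%s.png' % i  # save_screenshot writes PNG data
    driver.save_screenshot(picture_name)
    driver.find_element_by_xpath('.//tbody/tr/td//a[2]').click()
# Close the browser only after every window has been captured
driver.quit()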

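  And that's it, all in one go: the traffic graphs you wanted now sit in the same directory as this .py file. The time windows I capture are dictated by what my job requires; adjust them to whatever periods you need.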
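  Because the screenshots are named after the loop index, it can be hard to tell later which time window each file covers. A minimal sketch of one way around that, assuming the time points keep the "YYYY-MM-DD HH:MM:SS" format used above; the helper name `screenshot_name` is hypothetical and not part of the original script:

```python
def screenshot_name(start, end):
    """Build a file name from a window's start and end time strings.

    Both arguments are expected in "YYYY-MM-DD HH:MM:SS" form.
    The name carries the start date plus the start and end clock times.
    """
    start_day, start_clock = start.split()
    end_clock = end.split()[1]
    # Keep only HH:MM and drop the colon, which is not allowed in Windows file names
    start_hhmm = start_clock[:5].replace(":", "")
    end_hhmm = end_clock[:5].replace(":", "")
    return "%s_%s-%s.png" % (start_day, start_hhmm, end_hhmm)


print(screenshot_name("2019-10-17 06:00:00", "2019-10-17 10:00:00"))
# prints 2019-10-17_0600-1000.png
```

  The result can be passed to driver.save_screenshot in place of the index-based name.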

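  I suspect the date for loop in the middle could be simpler, but I have not yet worked out how to tidy it up; I will update this post if I improve it. If you have a better idea, feel free to message me.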
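  One possible simplification, offered as a sketch rather than a tested replacement: every branch of the if/elif chain does the same thing for a different day offset, so a nested loop over days and hours can build the same list. The function name `build_time_points` and its parameters are made up for illustration; the defaults reproduce the original schedule (06:00, 10:00, 18:00 and 22:00 for each of the last seven days, plus 06:00 of the following day as the final end point).

```python
import datetime


def build_time_points(today, days_back=6, hours=(6, 10, 18, 22)):
    """Return "YYYY-MM-DD HH:00:00" strings, oldest first.

    Covers every hour in `hours` from `days_back` days before `today`
    up to `today`, then adds the first hour of the next day so the
    last window has an end point.
    """
    points = []
    for offset in range(days_back, -1, -1):
        day = today - datetime.timedelta(days=offset)
        for hour in hours:
            points.append("%s %02d:00:00" % (day.isoformat(), hour))
    next_day = today + datetime.timedelta(days=1)
    points.append("%s %02d:00:00" % (next_day.isoformat(), hours[0]))
    return points


time_points = build_time_points(datetime.date.today())
# Pair each time point with the next one: (start, end) of every window
windows = list(zip(time_points, time_points[1:]))
print(len(time_points), "time points,", len(windows), "windows")
```

  With the defaults this gives 29 time points and 28 windows, and changing the schedule becomes a matter of passing different `days_back` or `hours` values.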

  If you repost this article, please include a link to the original. Thanks!

 

  

 

  

posted @ 2019-10-23 18:03  python终极者