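Selenium Study Notes 13: Solving the CAPTCHA Problem
Option 1: Recognizing CAPTCHAs with pytesseract and Pillow
The pytesseract and PIL modules can handle CAPTCHAs that are not too complex. The steps are:
1. Install pytesseract -> pip install pytesseract (pytesseract is only a wrapper, so the Tesseract OCR engine itself must also be installed on the machine)
2. Install the PIL module -> pip install pillow (PIL is now distributed as the Pillow package; it is still imported as PIL)
This approach can only recognize simple CAPTCHAs.
The code is as follows: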
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from PIL import Image
import pytesseract


def test1():
    # Open the Chrome browser
    browser = webdriver.Chrome()
    # Open the registration page
    browser.get("http://localhost:8080/jpress/user/register")
    browser.maximize_window()
    # Take a screenshot of the page that contains the CAPTCHA image
    picture_name1 = str(time.time()) + '.png'
    browser.save_screenshot(picture_name1)
    # find_element(By.ID, ...) replaces find_element_by_id, which was removed in Selenium 4
    ce = browser.find_element(By.ID, "captchaimg")
    print(ce.location)
    left = ce.location['x']
    top = ce.location['y']
    right = ce.size['width'] + left
    bottom = ce.size['height'] + top
    im = Image.open(picture_name1)
    # Crop the CAPTCHA out of the screenshot
    img = im.crop((left, top, right, bottom))
    picture_name2 = str(time.time()) + '.png'
    img.save(picture_name2)  # This is the cropped CAPTCHA image
    browser.close()


def test2():
    image1 = Image.open('test.png')
    # Avoid naming the variable str, which would shadow the built-in type
    text = pytesseract.image_to_string(image1)
    print(text)
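The crop step above assumes that one CSS pixel equals one screenshot pixel. On high-DPI displays the screenshot is often larger than the page coordinates, so the crop box can land in the wrong place. The following is a minimal sketch of the crop arithmetic with an optional scale factor. The helper name crop_box and the scale parameter are my own assumptions for illustration, not part of Selenium or Pillow; the scale could be read with browser.execute_script("return window.devicePixelRatio").

```python
def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) for Image.crop().

    location and size are dicts shaped like Selenium's element.location
    and element.size. scale is a hypothetical parameter for the ratio
    between screenshot pixels and CSS pixels (1.0 on ordinary displays).
    """
    left = location['x'] * scale
    top = location['y'] * scale
    right = (location['x'] + size['width']) * scale
    bottom = (location['y'] + size['height']) * scale
    # Image.crop() expects integer pixel coordinates
    return tuple(int(round(v)) for v in (left, top, right, bottom))


if __name__ == "__main__":
    # Example values only; a real run would use ce.location and ce.size
    print(crop_box({'x': 10, 'y': 20}, {'width': 100, 'height': 40}))
    print(crop_box({'x': 10, 'y': 20}, {'width': 100, 'height': 40}, scale=2.0))
```

The returned tuple can be passed straight to im.crop(...) in test1 in place of the hand-computed values.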
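Option 2: Recognizing CAPTCHAs through a third-party AI service
For complex CAPTCHAs, you can call a third-party API instead. ShowAPI (万维易源) provides one; its website is https://www.showapi.com/
The CAPTCHA recognition API is described at: https://www.showapi.com/apiGateway/view?apiCode=184
The code is as follows: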
from lib.ShowapiRequest import ShowapiRequest


def test01():
    # Arguments: API endpoint, app ID, app secret (replace the last two with your own)
    r = ShowapiRequest("http://route.showapi.com/184-4", "547473", "741edd63a66b44159bfd7f7bb23ce60a")
    r.addFilePara("image", "test.png")
    r.addBodyPara("typeId", "34")
    r.addBodyPara("convert_to_jpg", "0")
    r.addBodyPara("needMorePrecise", "0")
    res = r.post()
    result = res.text
    print(result)
    # The recognized text is in the 'Result' field of the response body
    body = res.json()['showapi_res_body']
    print(body['Result'])
    # print(res.text)  # raw response
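The code above indexes straight into the response, which raises KeyError if the request fails or the body comes back without a result. Below is a small sketch of a more defensive lookup. It relies only on the two keys used above ('showapi_res_body' and 'Result'); the helper name extract_captcha_text is hypothetical, and treating an empty string as "no result" is my own assumption.

```python
def extract_captcha_text(payload):
    """Return the recognized CAPTCHA text, or None if it is missing.

    payload is the decoded JSON response, for example res.json().
    """
    if not isinstance(payload, dict):
        return None
    body = payload.get('showapi_res_body')
    if not isinstance(body, dict):
        return None
    result = body.get('Result')
    # Assumption: a blank or non-string Result means recognition failed
    if not isinstance(result, str) or not result.strip():
        return None
    return result.strip()


if __name__ == "__main__":
    # Hand-written sample payloads, not real API output
    print(extract_captcha_text({'showapi_res_body': {'Result': 'ab12'}}))
    print(extract_captcha_text({'showapi_res_body': {}}))
```

In test01 this would be called as extract_captcha_text(res.json()), with the caller deciding whether to retry when it returns None.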