selenium + chrome gets detected: notes on countering anti-scraping

Mostly excerpted from https://www.cnblogs.com/haoabcd2010/p/10552641.html

selenium + chrome

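Many hard-to-scrape sites are crawled with selenium, but it turns out selenium leaves identifiable fingerprint values and can be detected. These are my notes on ways around that detection.
Test page: https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html (all rows green seems to mean nothing was detected)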
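
One of the signals such a test page looks at can also be read from code instead of by eye. The following is only a rough sketch, assuming selenium and a matching chromedriver are installed; `TEST_URL`, `looks_automated` and `read_webdriver_flag` are names made up for this illustration, and `navigator.webdriver` is just one signal among several.

```python
TEST_URL = ("https://intoli.com/blog/not-possible-to-block-chrome-headless/"
            "chrome-headless-test.html")


def looks_automated(webdriver_flag):
    """Interpret the value of navigator.webdriver reported by the browser.

    True means the browser announces that it is automated. None or False
    only means this one signal is absent; other checks may still fail.
    """
    return webdriver_flag is True


def read_webdriver_flag(url=TEST_URL):
    """Open the page in Chrome and return navigator.webdriver."""
    # Imported lazily so looks_automated() is usable without selenium.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.execute_script("return navigator.webdriver")
    finally:
        # Always close the browser, even if loading the page failed.
        driver.quit()
```

Calling `looks_automated(read_webdriver_flag())` before and after applying one of the tricks below gives a quick comparison, though a passing result here does not prove that a real site will be fooled.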

Man-in-the-middle modification of the JS

Many blog posts online describe this approach; I don't know whether it is reliable.
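
For reference, the idea in those posts usually looks something like the sketch below: a proxy sits between Chrome and the site and rewrites the detection script before the browser runs it. This is not taken from the source post. It assumes mitmproxy as the proxy, and the key list, the placeholder string and the `patch_js` helper are all hypothetical; a real site may probe different names or not use quoted strings at all.

```python
# Property names that detection scripts are often said to probe.
# This list is an assumption; the real one depends on the target site.
SUSPECT_KEYS = ["webdriver", "__driver_evaluate", "__webdriver_evaluate"]


def patch_js(source, keys=SUSPECT_KEYS, placeholder="NO-SUCH-ATTR"):
    """Replace double-quoted occurrences of each key with a missing name."""
    for key in keys:
        source = source.replace('"%s"' % key, '"%s"' % placeholder)
    return source


def response(flow):
    """mitmproxy event hook, called once for every completed response."""
    content_type = flow.response.headers.get("content-type", "")
    # Only touch JavaScript; leave HTML, images, etc. unchanged.
    if "javascript" in content_type:
        flow.response.text = patch_js(flow.response.text)
```

Saved as a script (say `patch_js.py`, a name chosen here), it would be loaded with `mitmdump -s patch_js.py`, with Chrome started using `--proxy-server=127.0.0.1:8080`. HTTPS traffic is only rewritten if the browser trusts the mitmproxy CA certificate.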

pyppeteer

This asyncio-based Python library (a port of Puppeteer) seems to solve the problem very well.
Jianshu post: https://www.jianshu.com/p/4dd2737a3048
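
A minimal sketch of what using pyppeteer looks like, assuming the `pyppeteer` package is installed. `TEST_URL` and `fetch_webdriver_flag` are illustrative names, and the code only reads one signal from the test page rather than demonstrating that detection is avoided.

```python
import asyncio

TEST_URL = ("https://intoli.com/blog/not-possible-to-block-chrome-headless/"
            "chrome-headless-test.html")


async def fetch_webdriver_flag(url=TEST_URL, headless=True):
    """Load the page with pyppeteer and return navigator.webdriver."""
    # Imported lazily; pyppeteer downloads its own Chromium on first launch.
    from pyppeteer import launch

    browser = await launch(headless=headless)
    try:
        page = await browser.newPage()
        await page.goto(url)
        # Evaluate a JavaScript expression in the page and return its value.
        return await page.evaluate("navigator.webdriver")
    finally:
        # Shut the browser down even if navigation raised an error.
        await browser.close()
```

It can be run with `asyncio.run(fetch_webdriver_flag())`; passing `headless=False` opens a visible window, which makes it easier to compare against the colours on the test page.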

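Developer mode

Using developer mode seems to avoid detection. It still needs more testing, but it appears to have gotten past Pinduoduo, hhh
[python + selenium code]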

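from selenium import webdriver

options = webdriver.ChromeOptions()
# Drop the enable-automation switch that chromedriver passes by default
options.add_experimental_option('excludeSwitches', ['enable-automation'])
driver = webdriver.Chrome(options=options)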

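Start chrome in remote debugging mode to hide the selenium fingerprint

In cmd, run
chrome.exe --remote-debugging-port=9222 --user-data-dir="absolute path"

Then, before creating the driver, add chrome_options.add_experimental_option('debuggerAddress','127.0.0.1:9222') so that selenium attaches to the chrome that is already running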
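
Put together, the two steps above might look like the following sketch. It assumes selenium and a matching chromedriver are installed; the helper names (`chrome_debug_command`, `start_chrome`, `attach`) are invented for this example, and the chrome path and profile directory are left as parameters because they differ per machine.

```python
import subprocess


def chrome_debug_command(chrome_path, user_data_dir, port=9222):
    """Build the cmd line shown above as an argument list."""
    return [
        chrome_path,
        "--remote-debugging-port=%d" % port,
        "--user-data-dir=%s" % user_data_dir,
    ]


def start_chrome(chrome_path, user_data_dir, port=9222):
    """Launch chrome with remote debugging enabled."""
    # Popen returns immediately; chrome keeps running in the background.
    return subprocess.Popen(chrome_debug_command(chrome_path, user_data_dir, port))


def attach(port=9222):
    """Attach selenium to a chrome that is already listening on the port."""
    # Imported lazily so the helpers above work without selenium installed.
    from selenium import webdriver

    chrome_options = webdriver.ChromeOptions()
    # Connect to the running browser instead of launching a new one.
    chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:%d" % port)
    return webdriver.Chrome(options=chrome_options)
```

A typical sequence would be `start_chrome(...)`, a short wait for the port to open, then `driver = attach()`. Because the browser was started by hand rather than by chromedriver, closing it is also left to the caller.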

Addendum: in the end, the approach described here seems the most reliable: https://www.cnblogs.com/bgmc/p/12154484.html

References: https://www.cnblogs.com/haoabcd2010/p/10552641.html

https://www.v2ex.com/amp/t/588946

posted on 2020-03-03 17:13 pu369com
