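220704 WeChat crawler goals
1.
The department lead has asked for a WeChat crawler. I checked, and newly registered phone numbers can no longer log in to the WeChat web client, so another approach is needed.
Searching online, I found write-ups showing that the WeChat PC client can be driven with two tools: psutil (to get the process information of WeChat for PC) and pywinauto (to automate WeChat for PC). Worth trying. Reference links: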
https://blog.csdn.net/u010835747/article/details/119979088?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_baidulandingword~default-1-119979088-blog-118603294.pc_relevant_aa2&spm=1001.2101.3001.4242.2&utm_relevant_index=4
https://blog.csdn.net/biggbang/article/details/118603294 Driving the WeChat client with pywinauto in Python to build an official-account crawler
http://www.manongjc.com/detail/13-npommryytjqsepj.html pywinauto tutorial
https://blog.csdn.net/wuyoudeyuer/article/details/119811985 WeChat client crawler with pyautogui and pywinauto
https://pywinauto.readthedocs.io/en/latest/getting_started.html pywinauto official documentation
https://blog.csdn.net/smart_num_1/article/details/122406466
The approach in the fourth link, which covers pyautogui, is worth borrowing. Copying the code here first:
# -*- coding:utf-8 -*-
import psutil
import pywinauto
from pywinauto.application import Application

'''
psutil: used to get the process information of WeChat for PC
pywinauto: used to automate WeChat for PC
'''

def getWechat():
    # Initialize the default process ID
    PID = 0
    # Look up the process ID to hand to pywinauto so it can connect to WeChat for PC
    for proc in psutil.process_iter():
        try:
            pinfo = proc.as_dict(attrs=['pid', 'name'])
        except psutil.NoSuchProcess:
            pass
        else:
            # Name as copied from the source; the modified version below matches 'WeChat.exe'
            if 'Wechat.exe' == pinfo['name']:
                PID = pinfo['pid']
    # Instantiate pywinauto and connect to the running application
    app = Application(backend='uia').connect(process=PID)
    # Control WeChat for PC and open Moments ('朋友圈')
    win = app['微信']  # main window, titled '微信' (WeChat)
    pyq_but = win.child_window(title='朋友圈', control_type="Button")
    pyq_but.draw_outline()
    cords = pyq_but.rectangle()
    # Click the Moments button
    pywinauto.mouse.click(button='left', coords=(cords.left + 10, cords.top + 10))
    pyq_win = app['朋友圈']
    pyq_win.draw_outline()
    # Get the control structure inside the Moments window
    print(f'Moments control structure: {pyq_win.dump_tree()}')

if __name__ == '__main__':
    getWechat()
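In the afternoon I tried installing and running pywinauto and hit this error: ImportError: DLL load failed while importing win32ui: A dynamic link library (DLL) initialization routine failed. Following a fix found online, I reinstalled pywin32 at version 300, after which the example could open the program.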
pip install --upgrade pywin32==300 --user
Reference link: https://blog.csdn.net/m0_46639364/article/details/107771383?spm=1001.2101.3001.6661.1&utm_medium=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1-107771383-blog-118547990.pc_relevant_aa2&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1-107771383-blog-118547990.pc_relevant_aa2&utm_relevant_index=1 WeChat official-account crawler
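To confirm that the reinstall took effect, a small check like the following may help. It is only a sketch: check_pywin32 is a hypothetical helper name, and it assumes Python 3.8 or later so that importlib.metadata is available. It reports the installed pywin32 version and whether win32ui can be imported in the current interpreter.

```python
import importlib
from importlib import metadata


def check_pywin32():
    """Return (installed_version, win32ui_ok) for the current interpreter."""
    try:
        version = metadata.version("pywin32")
    except metadata.PackageNotFoundError:
        version = None  # pywin32 is not installed in this environment

    try:
        importlib.import_module("win32ui")
        win32ui_ok = True
    except ImportError:
        # The "DLL load failed while importing win32ui" failure is an ImportError
        win32ui_ok = False

    return version, win32ui_ok


if __name__ == "__main__":
    version, ok = check_pywin32()
    print(f"pywin32 version: {version}, win32ui import ok: {ok}")
```

If win32ui still fails to import after the reinstall, it is worth checking that pip installed into the same interpreter that runs the script.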
My modified version:
import pywinauto
import psutil
from pywinauto.application import Application

'''
psutil: used to get the process information of WeChat for PC
pywinauto: used to automate WeChat for PC
'''

#app = Application(backend='uia').start('notepad.exe')

def find_PID():
    PID = 0
    print("Getting the WeChat process ID")
    for proc in psutil.process_iter():
        try:
            pinfo = proc.as_dict(attrs=['pid', 'name'])
        except psutil.NoSuchProcess:
            # The process exited while being inspected; skip it
            pass
        else:
            if 'WeChat.exe' == pinfo['name']:
                PID = pinfo['pid']
                print("Found the WeChat process")
    if PID == 0:
        # No matching process was found in the whole list
        print("WeChat is not open")
    return PID

PID = find_PID()
print("Instantiating an application with pywinauto")
app = Application(backend='uia').connect(process=PID)
print("Opening WeChat Moments")
win = app['微信']  # main window, titled '微信' (WeChat)
pyq_but = win.child_window(title='朋友圈', control_type="Button")  # '朋友圈' is Moments
pyq_but.draw_outline()
cords = pyq_but.rectangle()
# Click the Moments button
pywinauto.mouse.click(button='left', coords=(cords.left + 10, cords.top + 10))
pyq_win = app['朋友圈']
pyq_win.draw_outline()
# Get the control structure inside the Moments window
print(f'Moments control structure: {pyq_win.dump_tree()}')
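The process lookup above compares the name case-sensitively and keeps only the last match. A slightly more forgiving variant might look like the sketch below. find_pids_by_name and its processes parameter are hypothetical names introduced here for illustration; the processes hook exists only so the function can be exercised without psutil, and when it is omitted the list comes from psutil.process_iter.

```python
def find_pids_by_name(target, processes=None):
    """Return the PIDs of all processes whose name equals target, ignoring case.

    processes is an optional iterable of {'pid': ..., 'name': ...} dicts
    (a hypothetical hook for testing); when omitted, psutil supplies the list.
    """
    if processes is None:
        import psutil  # imported lazily so the helper can be tested without it
        processes = (p.info for p in psutil.process_iter(["pid", "name"]))

    wanted = target.lower()
    return [
        info["pid"]
        for info in processes
        # name can be None when the process could not be queried
        if (info.get("name") or "").lower() == wanted
    ]


if __name__ == "__main__":
    pids = find_pids_by_name("WeChat.exe")
    print(pids if pids else "WeChat does not appear to be running")
```

Returning a list makes the "not running" case explicit (an empty list) instead of relying on PID 0, which Application.connect cannot attach to.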
0715
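Weighing the learning cost against the schedule, I decided to use pywinauto for WeChat message monitoring in the short term (the next three to four months), and then spend the following year learning how to implement hook injection into WeChat.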
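One possible shape for the pywinauto-based monitoring is a simple polling loop: read the visible messages at intervals and report only the ones that are new. The sketch below covers only that loop and is not tied to any real WeChat control layout. read_messages is a hypothetical callable that would, in the real script, return the list of message texts currently visible in the chat window (for example by walking controls through pywinauto); new_items, monitor, on_message, and max_polls are likewise names made up for this illustration.

```python
import time


def new_items(previous, current):
    """Return the entries of current that follow its overlap with previous."""
    # Look for the longest tail of previous that is also the head of current,
    # which handles older messages scrolling out of view.
    for overlap in range(min(len(previous), len(current)), 0, -1):
        if previous[-overlap:] == current[:overlap]:
            return current[overlap:]
    # No overlap at all: treat everything currently visible as new
    return list(current)


def monitor(read_messages, on_message, interval=2.0, max_polls=None):
    """Poll read_messages() and call on_message for each newly seen entry."""
    seen = list(read_messages())
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = list(read_messages())
        for msg in new_items(seen, current):
            on_message(msg)
        seen = current
        polls += 1
```

Polling is cruder than hooking, but it stays within what UI automation can observe, which fits the short-term plan.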