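Offline OCR
1. Download the latest Tesseract installer for Windows: https://digi.bib.uni-mannheim.de/tesseract/
2. After installing, add the installation directory to the PATH environment variable, then verify the setup by running tesseract -v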
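3. The English language pack eng.traineddata is included. Download any additional language packs you need from https://github.com/tesseract-ocr/tessdata, for example chi_sim.traineddata (Simplified Chinese), and place them in the tessdata folder of the installation.
4. Install the third-party Python library: pip install pytesseract
5. Recognition example: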
from PIL import Image
import pytesseract

# Recognize Simplified Chinese and English text in the image
words = pytesseract.image_to_string(Image.open('...xxx/test.png'), lang='chi_sim+eng')
print(words)
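Before running the recognition example, it can help to confirm that the setup from the steps above actually worked. The following is a minimal sketch, not part of the original notes: it checks whether tesseract is reachable through PATH and whether the language packs named in a lang string such as 'chi_sim+eng' are installed. The helper names (find_tesseract, parse_list_langs, missing_languages) are hypothetical, and the parsing assumes that tesseract --list-langs prints a header line beginning with "List of available languages" followed by one language code per line.

```python
import shutil
import subprocess


def find_tesseract(executable="tesseract"):
    """Return the full path of the tesseract binary, or None if it is not on PATH."""
    return shutil.which(executable)


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`.

    Assumption: the output has a header line starting with
    'List of available languages', then one language code per line.
    """
    codes = []
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith("List of available languages"):
            codes.append(line)
    return codes


def missing_languages(lang_spec, installed):
    """Return the parts of a spec like 'chi_sim+eng' that are not installed."""
    return [code for code in lang_spec.split("+") if code not in installed]


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; recheck the environment variable step")
    else:
        result = subprocess.run([path, "--list-langs"],
                                capture_output=True, text=True)
        # Some versions may write the list to stderr instead of stdout.
        installed = parse_list_langs(result.stdout or result.stderr)
        missing = missing_languages("chi_sim+eng", installed)
        if missing:
            print("missing language packs:", ", ".join(missing))
        else:
            print("tesseract and all requested language packs are available")
```

If a language pack is reported as missing, the usual cause is that the .traineddata file was not copied into the tessdata folder that the installed tesseract actually reads.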
References for this note:
https://blog.csdn.net/ad_yangang/article/details/121294009
https://zhuanlan.zhihu.com/p/122495884
https://zhuanlan.zhihu.com/p/35687577