Installing tesseract + pytesseract on CentOS to recognize image CAPTCHAs
When reposting, please credit the source: http://www.cnblogs.com/blazer/p/7131202.html
Environment: CentOS 6.7
tesseract-3.05
pytesseract-0.1.7
Imaging-1.1.7
Ubuntu
If they are not already installed, you need the following libraries (Ubuntu 16.04/14.04):
sudo apt-get install g++ # or clang++ (presumably)
sudo apt-get install autoconf automake libtool
sudo apt-get install autoconf-archive
sudo apt-get install pkg-config
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev
If you plan to install the training tools, you also need the following libraries:
sudo apt-get install libicu-dev
sudo apt-get install libpango1.0-dev
sudo apt-get install libcairo2-dev
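The packages above are the dependencies the official instructions tell you to install.
If you install with yum instead, some of the package names are different, so you have to look them up and install them one at a time.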
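As a rough starting point for the yum case, the sketch below maps the apt names listed above to what are, as far as I know, the usual CentOS package names, and builds a single yum command from them. The yum names are assumptions and do not come from the official instructions, so confirm each one with yum search first. Some of them, autoconf-archive in particular, may need the EPEL repository or may not exist for CentOS 6. The helper name yum_command is made up for this example.

```python
# Assumed apt -> yum package name mapping; verify each with `yum search <name>`.
APT_TO_YUM = {
    "g++": "gcc-c++",
    "autoconf": "autoconf",
    "automake": "automake",
    "libtool": "libtool",
    "autoconf-archive": "autoconf-archive",  # assumption: may require EPEL
    "pkg-config": "pkgconfig",
    "libpng12-dev": "libpng-devel",
    "libjpeg8-dev": "libjpeg-turbo-devel",
    "libtiff5-dev": "libtiff-devel",
    "zlib1g-dev": "zlib-devel",
    # Only needed for the training tools.
    "libicu-dev": "libicu-devel",
    "libpango1.0-dev": "pango-devel",
    "libcairo2-dev": "cairo-devel",
}


def yum_command(apt_names):
    """Build one `yum install` command for the given apt package names."""
    yum_names = []
    for name in apt_names:
        # Fall back to the apt name unchanged when no mapping is known.
        yum_names.append(APT_TO_YUM.get(name, name))
    return "sudo yum install -y " + " ".join(yum_names)


print(yum_command(["g++", "libjpeg8-dev", "zlib1g-dev"]))
```

Note that on CentOS the headers live in the -devel packages, so having the runtime library installed is not the same as having the build dependency.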
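Once everything is installed, run the following Python:
from PIL import Image  # with Imaging-1.1.7, a plain "import Image" also works
import pytesseract
image = Image.open('yzm.jpeg')
vcode = pytesseract.image_to_string(image)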
This may fail with the following error:
IOError: decoder jpeg not available
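This error means the imaging library was compiled without JPEG support. A minimal sketch for confirming that from Python follows. It assumes that compiled-in decoders show up as attributes named like jpeg_decoder on Image.core, which is my understanding of how selftest.py checks them. The function names missing_decoders and report are hypothetical.

```python
def missing_decoders(core, codecs=("jpeg", "zip")):
    """Return the codecs for which `core` exposes no `<codec>_decoder`."""
    available = dir(core)
    return [c for c in codecs if c + "_decoder" not in available]


def report():
    """Check the installed imaging library (assumes PIL is importable)."""
    # Imported lazily so missing_decoders() stays usable without PIL.
    from PIL import Image
    return missing_decoders(Image.core)
```

Calling report() on a working install should return an empty list. A result containing "jpeg" matches the IOError above.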
In that case, reinstall Imaging-1.1.7.
You may run into a problem while installing it.
python selftest.py
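Running this script shows which image formats are supported.
My CentOS system already had the libjpeg-turbo package installed.
Even so, the output of the script still contained the following lines: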
*** TKINTER support not installed
*** JPEG support not installed
*** ZLIB (PNG/ZIP) support not installed
*** FREETYPE2 support not installed
*** LITTLECMS support not installed
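When repeating the build several times, it can help to pull just the missing features out of the selftest output. The following is a small sketch that assumes the output keeps the "*** NAME support not installed" wording shown above. The function name missing_features is made up for this example.

```python
import re

# Matches entries such as "*** JPEG support not installed".
MISSING = re.compile(r"\*\*\*\s+(.+?)\s+support not installed")


def missing_features(selftest_output):
    """Return the feature names that selftest reports as not installed."""
    return MISSING.findall(selftest_output)


sample = """\
*** TKINTER support not installed
*** JPEG support not installed
*** ZLIB (PNG/ZIP) support not installed
*** FREETYPE2 support not installed
*** LITTLECMS support not installed
"""
print(missing_features(sample))
```

For recognizing JPEG and PNG CAPTCHAs, the entries that matter are JPEG and ZLIB. TKINTER and LITTLECMS can stay missing.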
So, in setup.py, change
TCL_ROOT = None
JPEG_ROOT = None
ZLIB_ROOT = None
TIFF_ROOT = None
FREETYPE_ROOT = None
LCMS_ROOT = None
to
TCL_ROOT = "/usr/lib64/"
JPEG_ROOT = "/usr/lib64/"
ZLIB_ROOT = "/usr/lib64/"
TIFF_ROOT = "/usr/lib64/"
FREETYPE_ROOT = "/usr/lib64/"
LCMS_ROOT = "/usr/lib64/"
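The same edit can be scripted instead of made by hand. This is only a sketch: it assumes each assignment sits on its own line in exactly the form NAME_ROOT = None, and the function name set_roots is hypothetical.

```python
import re

# One "<NAME>_ROOT = None" assignment per line, nothing else on the line.
ROOT_LINE = re.compile(r"^(\w+_ROOT)[ \t]*=[ \t]*None[ \t]*$", re.MULTILINE)


def set_roots(setup_source, libdir="/usr/lib64/"):
    """Replace every `X_ROOT = None` line with `X_ROOT = "<libdir>"`."""
    def repl(match):
        return '%s = "%s"' % (match.group(1), libdir)
    return ROOT_LINE.sub(repl, setup_source)


sample = "JPEG_ROOT = None\nZLIB_ROOT = None\n"
print(set_roots(sample))
```

To apply it for real, read the contents of setup.py, pass them through set_roots, and write the result back, keeping a backup copy of the original file.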
Then rebuild and reinstall:
python2.7 setup.py clean
python2.7 setup.py build_ext
python2.7 setup.py build
python2.7 setup.py install