clang: error: linker command failed with exit code 1 (use -v to see invocation)
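While researching tools for an OCR project, I came across gosseract, an open-source library whose recognition results were quite good.
Setting up the environment step by step, I first installed tesseract (which gosseract depends on) on macOS: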
brew install tesseract
$ tesseract -v
tesseract 4.1.3
 leptonica-1.82.0
  libgif 5.2.1 : libjpeg 9d : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.1 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
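The first installation went smoothly.
As business requirements grew, we needed to train languages, which requires the training tools, so I decided to uninstall and reinstall: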
$ brew install --with-training-tools tesseract
Usage: brew install [options] formula|cask [...]
Install a formula or cask. Additional options specific to a formula may be appended to the command.
...
Error: invalid option: --with-training-tools
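Homebrew rejects the option, which means this way of installing the training tools is no longer supported. So I built from source instead:
Install the dependencies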
# Packages which are always needed.
brew install automake autoconf libtool
brew install pkgconfig
brew install icu4c
brew install leptonica
# Packages required for training tools.
brew install pango
# Optional packages for extra features.
brew install libarchive
# Optional package for builds using g++.
brew install gcc
Download and extract
https://github.com/tesseract-ocr/tesseract/releases
Build and install
cd tesseract-5.1.0
./autogen.sh
mkdir build
cd build
# Optionally add CXX=g++-8 to the configure command if you really want to use a different compiler.
../configure PKG_CONFIG_PATH=/usr/local/opt/icu4c/lib/pkgconfig:/usr/local/opt/libarchive/lib/pkgconfig:/usr/local/opt/libffi/lib/pkgconfig
make -j
# Optionally install Tesseract.
sudo make install
# Optionally build and install training tools.
make training
sudo make training-install
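The PKG_CONFIG_PATH above hard-codes /usr/local/opt, which is the Homebrew prefix on Intel Macs; on Apple Silicon the default prefix is /opt/homebrew. As a rough sketch (the helper name pkgconfig_path_for is my own, not part of the Tesseract build instructions), the value can be assembled from whatever prefixes `brew --prefix` reports:

```shell
#!/bin/sh
# Hypothetical helper: join "<prefix>/lib/pkgconfig" for every prefix
# passed as an argument, separated by ':'.
pkgconfig_path_for() {
    result=""
    for prefix in "$@"; do
        # Prepend "result:" only when result is already non-empty.
        result="${result:+$result:}$prefix/lib/pkgconfig"
    done
    printf '%s\n' "$result"
}
```

It could then be used as `../configure PKG_CONFIG_PATH="$(pkgconfig_path_for "$(brew --prefix icu4c)" "$(brew --prefix libarchive)" "$(brew --prefix libffi)")"`, assuming those three formulae are installed.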
Problem:
After the installation finished, building the project failed:
2022/03/31 15:32:10 ERROR ▶ 0004 Failed to build the application:
# ocr
/usr/local/go/pkg/tool/darwin_amd64/link: running clang++ failed: exit status 1
Undefined symbols for architecture x86_64:
  "tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, GenericVector<STRING> const*, GenericVector<STRING> const*, bool)", referenced from:
      Init(void*, char*, char*) in 000023.o
      _Init in 000023.o
      _GetDataPath in 000023.o
  "tesseract::TessBaseAPI::Recognize(ETEXT_DESC*)", referenced from:
      _GetBoundingBoxesVerbose in 000023.o
      _GetBoundingBoxes in 000023.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
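From the error message alone, it was not obvious that this was a version problem. After several rounds of uninstalling and reinstalling, I found that the installed version (5.1.0) was too new. In hindsight, the undefined symbols take GenericVector<STRING> arguments, which is the 4.x form of the API, so the bindings were most likely compiled against 4.x headers and then linked against the 5.x library. After reinstalling version 4.1.3, the service compiled normally.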
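To catch this kind of mismatch before the linker does, a small version guard can run ahead of the build. This is only a sketch: the function names are hypothetical, and the "4.x only" rule is an assumption based on the single version (4.1.3) that worked for me.

```shell
#!/bin/sh
# Hypothetical helper: print the major component of a version string
# such as "4.1.3".
tesseract_major() {
    # Strip everything from the first dot onward.
    printf '%s\n' "${1%%.*}"
}

# Return 0 if the version is in the 4.x series, non-zero otherwise.
# Assumption: only 4.1.3 was confirmed to link in this project.
is_supported_tesseract() {
    [ "$(tesseract_major "$1")" = "4" ]
}
```

For example, `is_supported_tesseract "$(pkg-config --modversion tesseract)" || echo "unexpected tesseract version" >&2` reads the version from the installed tesseract.pc file and warns if it is not 4.x.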
To uninstall, you can either delete the installed files by hand or, for a Homebrew installation, run:
brew uninstall tesseract
However, reinstalling tesseract afterwards ran into a series of problems, shown below:
$ brew install tesseract==4.1.3
Warning: No available formula with the name "tesseract==4.1.3". Did you mean tesseract?
==> Searching for similarly named formulae...
This similarly named formula was found:
tesseract
To install it, run:
  brew install tesseract
==> Searching for a previously deleted formula (in the last month)...
Error: No previously deleted formula found.
==> Searching taps on GitHub...
Error: No formulae found in taps.
liumeng@liumengdeMacBook-Pro Pictures % brew install tesseract
==> Downloading https://ghcr.io/v2/homebrew/core/tesseract/manifests/4.1.3
Already downloaded: /Users/liumeng/Library/Caches/Homebrew/downloads/9597a8ae2cb676cd25c79cf252f4eb8759b9cf3d472c57f7c764e086c5f8f6e2--tesseract-4.1.3.bottle_manifest.json
==> Downloading https://ghcr.io/v2/homebrew/core/tesseract/blobs/sha256:1b67091dce98b42c6c561981a01738fe01c19ac69a1dc4de6d8e43fe885177f0
Already downloaded: /Users/liumeng/Library/Caches/Homebrew/downloads/cf8d3fbb1aea1cc629c6873a25b11d732c90ff23bfa4c44ba23d0ce5c24e907a--tesseract--4.1.3.big_sur.bottle.tar.gz
==> Pouring tesseract--4.1.3.big_sur.bottle.tar.gz
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
Could not symlink include/tesseract/apitypes.h
/usr/local/include/tesseract is not writable.
You can try again using:
  brew link tesseract
==> Caveats
This formula contains only the "eng", "osd", and "snum" language data files.
If you need any other supported languages, run `brew install tesseract-lang`.
==> Summary
🍺 /usr/local/Cellar/tesseract/4.1.3: 65 files, 29.7MB
Following the hint in the error message, the next step is:
$ brew link tesseract
Linking /usr/local/Cellar/tesseract/4.1.3...
Error: Could not symlink include/tesseract/apitypes.h
/usr/local/include/tesseract is not writable.
The directory is left over from the earlier `sudo make install`, so it has to be removed first:
$ sudo rm -rf /usr/local/include/tesseract
Then try linking again:
$ brew link tesseract
Linking /usr/local/Cellar/tesseract/4.1.3...
Error: Could not symlink share/tessdata/configs/alto
Target /usr/local/share/tessdata/configs/alto already exists. You may want to remove it:
  rm '/usr/local/share/tessdata/configs/alto'
To force the link and overwrite all conflicting files:
  brew link --overwrite tesseract
To list all files that would be deleted:
  brew link --overwrite --dry-run tesseract
Homebrew suggests three options here.
I proceeded as follows:
$ sudo rm -rf /usr/local/share/tessdata/configs/alto
$ brew link --overwrite --dry-run tesseract
Would remove:
/usr/local/share/tessdata/configs/ambigs.train
...
/usr/local/lib/libtesseract.dylib -> /usr/local/lib/libtesseract.5.dylib
/usr/local/lib/pkgconfig/tesseract.pc
liumeng@liumengdeMacBook-Pro Pictures % tesseract -v
zsh: command not found: tesseract
liumeng@liumengdeMacBook-Pro Pictures % brew install tesseract
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/cask).
==> Updated Casks
Updated 7 casks.
Warning: tesseract 4.1.3 is already installed, it's just not linked.
To link this version, run:
  brew link tesseract
$ brew link --overwrite tesseract
Linking /usr/local/Cellar/tesseract/4.1.3...
Error: Could not symlink share/tessdata/configs/alto
/usr/local/share/tessdata/configs is not writable.
Keep removing the leftover directories:
$ sudo rm -rf /usr/local/share/tessdata/configs
$ brew link --overwrite tesseract
Linking /usr/local/Cellar/tesseract/4.1.3...
Error: Could not symlink share/tessdata/tessconfigs/batch
/usr/local/share/tessdata/tessconfigs is not writable.
$ sudo rm -rf /usr/local/share/tessdata/tessconfigs
$ brew link --overwrite tesseract
Linking /usr/local/Cellar/tesseract/4.1.3...
12 symlinks created.
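Rather than discovering the blocking directories one failed `brew link` at a time, they can be listed up front. The sketch below only prints paths and deletes nothing; the function name is hypothetical, and the three relative paths are simply the ones that showed up in my errors, so another machine may have different leftovers.

```shell
#!/bin/sh
# Hypothetical helper: print leftover Tesseract paths under the given
# prefix that may block `brew link`. Nothing is removed.
list_tesseract_leftovers() {
    prefix=$1
    for rel in include/tesseract share/tessdata/configs share/tessdata/tessconfigs; do
        # -e is true for both files and directories.
        if [ -e "$prefix/$rel" ]; then
            printf '%s\n' "$prefix/$rel"
        fi
    done
}
```

Running `list_tesseract_leftovers /usr/local` before linking shows what would need to be removed; review the list before passing anything to `sudo rm -rf`.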
Verify:
$ tesseract -v
tesseract 4.1.3
 leptonica-1.82.0
  libgif 5.2.1 : libjpeg 9d : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.1 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
The project now builds normally. Done!