Download the OCR engine

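MODI installation packages can be found online, but after installing with the script they include, the configuration may be incomplete, leaving French as the only language that can be recognized.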

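An OCR engine is also available for download from Ma Jian's (马健) original-works site. For detailed installation steps, see the documentation provided with the download.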

For background on Microsoft Office Document Imaging (MODI), see Ma Jian's write-up: https://www.cnblogs.com/stronghorse/p/4913447.html

Configure the registry

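If only French can be recognized, save the following text as a .reg file and import it into the registry. PDF 补丁丁 will then list three languages: Simplified Chinese, Traditional Chinese, and English.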

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\Installer\Components\61BA386016BD0C340BBEAC273D84FD5F]
"1028"=hex(7):76,00,55,00,70,00,41,00,56,00,4f,00,65,00,64,00,40,00,24,00,21,\
00,21,00,21,00,21,00,21,00,4d,00,4b,00,4b,00,53,00,6b,00,4f,00,43,00,52,00,\
5f,00,31,00,30,00,32,00,38,00,3c,00,00,00,00,00
"2052"=hex(7):76,00,55,00,70,00,41,00,56,00,53,00,2e,00,7d,00,58,00,25,00,21,\
00,21,00,21,00,21,00,21,00,4d,00,4b,00,4b,00,53,00,6b,00,4f,00,43,00,52,00,\
5f,00,32,00,30,00,35,00,32,00,3c,00,00,00,00,00
"1033"=hex(7):76,00,55,00,70,00,41,00,56,00,54,00,28,00,38,00,41,00,24,00,21,\
00,21,00,21,00,21,00,21,00,4d,00,4b,00,4b,00,53,00,6b,00,4f,00,43,00,52,00,\
5f,00,31,00,30,00,33,00,33,00,3e,00,26,00,61,00,45,00,4d,00,61,00,65,00,2c,\
00,37,00,71,00,39,00,2a,00,44,00,58,00,64,00,55,00,40,00,45,00,50,00,69,00,\
3d,00,00,00,00,00
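
Before importing registry data copied from a web page, it can help to see what the values actually contain. The following is a minimal Python sketch, not part of PDF 补丁丁 or MODI, that decodes the payload of a `hex(7):` entry (a REG_MULTI_SZ value stored as UTF-16LE). The function name `decode_reg_multi_sz` is a hypothetical helper introduced only for this illustration. The value names 1028, 2052 and 1033 are the Windows locale identifiers for Traditional Chinese, Simplified Chinese and English (United States).

```python
def decode_reg_multi_sz(hex_text: str) -> list[str]:
    """Decode the payload of a .reg `hex(7):` value into its strings.

    Hypothetical helper for illustration; pass only the part after `hex(7):`.
    """
    # Drop the line-continuation backslashes, newlines and spaces,
    # leaving a plain comma-separated list of hex bytes.
    cleaned = hex_text.replace("\\", "").replace("\n", "").replace(" ", "")
    raw = bytes(int(b, 16) for b in cleaned.split(",") if b)
    # REG_MULTI_SZ data is UTF-16LE text; strings are separated by NUL
    # characters and the whole list ends with an extra NUL.
    text = raw.decode("utf-16-le")
    return [s for s in text.split("\x00") if s]


# Payload of the "1028" value from the .reg text above.
value_1028 = r"""76,00,55,00,70,00,41,00,56,00,4f,00,65,00,64,00,40,00,24,00,21,\
00,21,00,21,00,21,00,21,00,4d,00,4b,00,4b,00,53,00,6b,00,4f,00,43,00,52,00,\
5f,00,31,00,30,00,32,00,38,00,3c,00,00,00,00,00"""

strings = decode_reg_multi_sz(value_1028)
print(strings)
```

The decoded string for 1028 contains the text `OCR_1028`, which matches the value name. The same helper can be applied to the 2052 and 1033 payloads.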

To use MODI for text recognition, you must also start PDF 补丁丁 as administrator. Otherwise the call to MODI will still fail.
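
To check whether a process is running with administrator rights, a small sketch like the one below can be used. It is only an illustration and has nothing to do with how PDF 补丁丁 itself checks privileges. `is_elevated` is a hypothetical helper name, and the check relies on the Windows shell function `IsUserAnAdmin`, called through Python's `ctypes`.

```python
import ctypes
import sys


def is_elevated() -> bool:
    """Return True if the current process runs with administrator rights.

    Hypothetical helper for illustration; always returns False off Windows.
    """
    if sys.platform != "win32":
        return False
    try:
        # IsUserAnAdmin returns a nonzero value when the caller is an administrator.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        # Treat any failure to reach the API as "not elevated".
        return False


if __name__ == "__main__":
    print("Running as administrator:", is_elevated())
```

Run from the same kind of shortcut or console used to launch the program, this reports whether that launch path is elevated.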

Posted on 2022-06-02 08:51 by PDF补丁丁