Generating text images for training deep learning OCR models


It also runs on Windows; you only need to set the Unicode encoding explicitly with encoding='utf-8'.
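As a minimal illustration of that fix, assuming the problem comes from Python's open() falling back to a locale code page on Windows instead of UTF-8. The helper names and the file name below are hypothetical and are not taken from the project:

```python
import os
import tempfile


def write_lines(path, lines):
    # Pass encoding explicitly so the result does not depend on the OS locale.
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")


def read_lines(path):
    # The same applies when reading a corpus or chars file.
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


# Hypothetical file name, used only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_corpus.txt")
write_lines(demo_path, ["深度学习", "text renderer"])
demo_lines = read_lines(demo_path)
```

Wherever the project opens a text file without an encoding argument, adding encoding='utf-8' in this way is probably the change being described.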




Run python main.py --help to see the parameters you need to set when generating your own text images.


Corpus mode: --corpus_mode accepts chn, eng, random, or list.



Image output: --tag names the subdirectory of the output directory where the images are saved.


python main.py --corpus_mode chn --corpus_dir MY_corpus --output_dir MY_samples --tag image

The --tag argument must be set; otherwise no results are produced.
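To make the relationship between --output_dir and --tag concrete, here is a rough sketch using argparse. It only mirrors the four options from the command above; the default values and the helper names parse_args and save_dir are assumptions for illustration, and the real main.py defines more options and may differ:

```python
import argparse
import os


def parse_args(argv=None):
    # Hypothetical re-creation of the four options used above.
    # The default values here are assumptions, not the project's.
    parser = argparse.ArgumentParser()
    parser.add_argument("--corpus_mode", default="chn",
                        choices=["chn", "eng", "random", "list"])
    parser.add_argument("--corpus_dir", default="./data/corpus")
    parser.add_argument("--output_dir", default="./output")
    parser.add_argument("--tag", default="default")
    return parser.parse_args(argv)


def save_dir(args):
    # Images go to a subdirectory of output_dir named after the tag.
    return os.path.join(args.output_dir, args.tag)


args = parse_args(["--corpus_mode", "chn", "--corpus_dir", "MY_corpus",
                   "--output_dir", "MY_samples", "--tag", "image"])
target = save_dir(args)
```

With the arguments shown, target is the image subdirectory inside MY_samples.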







Text Renderer

Generate text images for training deep learning OCR models (e.g. CRNN). Supports both Latin and non-Latin text.


  • Ubuntu 16.04
  • Python 3.5+

Install dependencies:

pip3 install -r requirements.txt


By default, running python3 main.py generates 20 text images and a labels.txt file in output/default/.


Use your own data to generate images

  1. Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folder.

  2. Configure the text effects and their fractions in configs/default.yaml (or create a new config file and pass it with the --config_file option). Here are some examples:

     Effect name
       • Origin (Font size 25)
       • Perspective Transform
       • Random Crop
       • Curve
       • Light border
       • Dark border
       • Random char space big
       • Random char space small
       • Middle line
       • Table line
       • Under line
       • Emboss
       • Reverse color
       • Blur
  3. Run main.py.
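A small sketch of what a per-effect fraction could mean, assuming it is the probability that an effect is applied to any given image. That interpretation, the effect keys, and the numbers are illustrative assumptions, not values copied from configs/default.yaml:

```python
import random


def choose_effects(fractions, rng):
    # Each effect is switched on independently with probability `fraction`.
    return [name for name, fraction in fractions.items()
            if rng.random() < fraction]


# Hypothetical effect names and fractions.
fractions = {"blur": 0.1, "curve": 0.0, "reverse_color": 1.0}
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = {name: 0 for name in fractions}
for _ in range(1000):
    for name in choose_effects(fractions, rng):
        counts[name] += 1
```

Under this reading, a fraction of 0 disables an effect, 1 applies it to every image, and 0.1 applies it to roughly one image in ten.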

Strict mode

For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of chars. In this case you get bad results, because the chars the font lacks are not rendered correctly.




Selecting fonts that support all chars in --chars_file is tedious. If you run main.py with the --strict option, the renderer retries getting text from the corpus during generation until all chars in the text are supported by a font.
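The retry behaviour described for --strict can be sketched roughly as follows. This is not the project's implementation; the function names, the max_retries limit, and the toy corpus are assumptions made for illustration:

```python
import random


def font_supports(text, supported_chars):
    # True when every character of the text is in the font's character set.
    return all(ch in supported_chars for ch in text)


def sample_supported_text(corpus, supported_chars, rng, max_retries=100):
    # Keep drawing from the corpus until the font can render the whole sample.
    for _ in range(max_retries):
        text = rng.choice(corpus)
        if font_supports(text, supported_chars):
            return text
    raise RuntimeError("no supported text found in %d tries" % max_retries)


# Toy corpus and a font that only covers lowercase Latin letters.
corpus = ["hello", "第一", "ocr"]
latin_only = set("abcdefghijklmnopqrstuvwxyz")
picked = [sample_supported_text(corpus, latin_only, random.Random(seed))
          for seed in range(20)]
```

Every entry in picked is either "hello" or "ocr", because the Chinese sample is rejected and redrawn. The retry cap is a design choice in this sketch so that a font with very poor coverage fails loudly instead of looping forever.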


You can use the check_font.py script to check how many chars in --chars_file your font does not support:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第', '朱', '广', '沪', '联', '自', '治', '县', '驼', '身', '进', '行', '纳', '税', '防', '火', '墙', '掏', '心', '内', '容', '万', '警','钟', '上', '了', '解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
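For readers who want to run a similar check by hand, here is one possible approach. It assumes the third-party fontTools package is available and is not necessarily how tools/check_font.py works; both function names are hypothetical:

```python
def load_supported_chars(font_path):
    # Assumption: fontTools is installed (pip install fonttools).
    # Imported lazily so the helper below works without it.
    from fontTools.ttLib import TTFont
    font = TTFont(font_path)
    try:
        # getBestCmap() maps Unicode code points to glyph names.
        cmap = font.getBestCmap() or {}
        return {chr(code) for code in cmap}
    finally:
        font.close()


def unsupported_chars(chars, supported):
    # Preserve the order of the chars file in the report.
    return [ch for ch in chars if ch not in supported]


# Demonstration with a hand-written character set instead of a real font.
chars = list("abc第朱")
missing = unsupported_chars(chars, set("abc"))
```

To check a real font, pass the result of load_supported_chars for your font file as the second argument, with chars read from your --chars_file.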

Generate images using GPU

If you want to use the GPU to generate images faster, first compile OpenCV with CUDA (see "Compiling OpenCV with CUDA support").

Then build the Cython part, and add the --gpu option when you run main.py:

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Running python3 main.py --debug saves images with extra information. You can see how perspectiveTransform works, along with all bounding and rotated boxes.
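As background for reading those debug images, the following sketch shows the arithmetic that a perspective transform applies to the corners of a text box: each point is multiplied by a 3x3 matrix and divided by the resulting homogeneous coordinate. It is written in plain Python rather than with OpenCV, and the matrix and box size are made-up values, not ones used by the renderer:

```python
def perspective_transform(points, matrix):
    # Map each (x, y) through a 3x3 homography with homogeneous division.
    out = []
    for x, y in points:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        out.append((xh / w, yh / w))
    return out


def bounding_box(points):
    # Axis-aligned box (x_min, y_min, x_max, y_max) around the warped corners.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical 100x32 text box and an illustrative matrix.
corners = [(0, 0), (100, 0), (100, 32), (0, 32)]
matrix = [[1.0, 0.0, 10.0],
          [0.0, 1.0, 5.0],
          [0.002, 0.0, 1.0]]
warped = perspective_transform(corners, matrix)
box = bounding_box(warped)
```

The non-zero entry in the bottom row is what makes the right edge of the box shrink toward the viewer; with a bottom row of 0, 0, 1 the transform reduces to a plain affine shift.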



See https://github.com/Sanster/text_renderer/projects/1

Posted 2018-10-11 16:29 by 静悟生慧