Generating Chinese Character Images with Pillow and OpenCV
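Background
CAPTCHAs are generated as images to defend against automated brute-force attacks by tools.
The images need to be rendered in different fonts, and distortions may be added as well:
- translation
- rotation
- a noisy background
- glyph warping
Another use case is generating standard copybooks in different fonts for handwriting practice.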
Reference code
https://github.com/atlantistin/Blogs/tree/master/20190801-hanzi-datasets
TTF fonts
https://zhuanlan.zhihu.com/p/28179203
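TTF (TrueType Font) is a font format developed by Apple and Microsoft as a competitor to the Type 1 fonts used in PostScript. TTF has long been the most common format on Mac and Windows, and all major browsers support it. However, IE8 does not support TTF, and IE9 supports it only when the font is set to "installable" (translator's note: don't bother, just convert to another format).
TTF allows the most basic digital rights management flags to be embedded: built-in flags indicate whether the font's author permits the font to be used in PDFs, on websites, and so on, so copyright issues can arise. Another drawback is that TTF and OTF fonts are uncompressed, so their files are larger.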
https://en.wikipedia.org/wiki/TrueType
TrueType is an outline font standard developed by Apple in the late 1980s as a competitor to Adobe's Type 1 fonts used in PostScript. It has become the most common format for fonts on the classic Mac OS, macOS, and Microsoft Windows operating systems.
The primary strength of TrueType was originally that it offered font developers a high degree of control over precisely how their fonts are displayed, right down to particular pixels, at various font sizes. With widely varying rendering technologies in use today, pixel-level control is no longer certain in a TrueType font.
https://docs.microsoft.com/en-us/typography/truetype/
TrueType is a digital font technology designed by Apple Computer, and now used by both Apple and Microsoft in their operating systems. Microsoft has distributed millions of quality TrueType fonts in hundreds of different styles, including them in its range of products and the popular TrueType Font Packs.
TrueType fonts offer the highest possible quality on computer screens and printers, and include a range of features which make them easy to use.
Chinese font downloads
https://www.fonts.net.cn/fonts-zh-1.html
Code (refactored)
https://github.com/fanqingsong/code_snippet/blob/master/machine_learning/pillow/chinese_chars/app.py
from PIL import Image, ImageDraw, ImageFont, ImageFilter
import numpy as np
import glob as gb
import shutil
import cv2
import os


def prepare_output_dir():
    # Start every run from an empty output directory.
    if os.path.exists(output_dir):
        shutil.rmtree(output_dir)
    os.makedirs(output_dir)


def make_background_generator():
    # Yield randomly chosen background images forever.
    # background_dir must contain at least one image, even in regular
    # mode, because main() always pulls from this generator.
    background_image_paths = gb.glob(os.path.join(background_dir, "*"))
    background_image_count = len(background_image_paths)
    while True:
        background_image_index = np.random.randint(background_image_count)
        one_background_image_path = background_image_paths[background_image_index]
        # Convert to RGB so the array shape matches the character image.
        yield Image.open(one_background_image_path).convert("RGB").resize(image_size)


def draw_char_on_image(char_image, char, font_path):
    font_size = height // 6 * 5
    if not regular_mode:
        font_size = int(np.random.randint(height // 2, height // 6 * 5))
    font = ImageFont.truetype(font_path, font_size)
    draw = ImageDraw.Draw(char_image)
    # font color is black by default
    rgb = (0, 0, 0)
    if not regular_mode:
        rgb = tuple(int(c) for c in np.random.randint(150, 255, size=3))
    # xy is (x, y): x is measured along the width, y along the height.
    xy = ((width - font_size) // 2, (height - font_size) // 2)
    draw.text(xy, char, rgb, font=font)


def shear_image(char_image):
    # Shear horizontally by a random angle in [-15, 15) degrees.
    theta = np.random.randint(-15, 15) * np.pi / 180
    m_shear = np.array([[1, np.tan(theta), 0], [0, 1, 0]], dtype=np.float32)
    image_shear = cv2.warpAffine(np.array(char_image), m_shear, image_size)
    return Image.fromarray(image_shear)


def get_char_image(char, font_path):
    # setup white background
    image_data = np.full(image_shape, 255, dtype=np.uint8)
    char_image = Image.fromarray(image_data)
    draw_char_on_image(char_image, char, font_path)
    # uglify the form of char: rotate, blur, then shear
    if not regular_mode:
        char_image = char_image.rotate(np.random.randint(-100, 100))
        char_image = char_image.filter(ImageFilter.GaussianBlur(radius=0.7))
        char_image = shear_image(char_image)
    return char_image


def make_char_image_generator_for_all_fonts(char):
    # One image per font file found in font_dir.
    font_paths = gb.glob(os.path.join(font_dir, "*"))
    print(font_paths)
    for font_path in font_paths:
        yield get_char_image(char, font_path)


def make_char_image_generator_randomly(char, n):
    # n images, each rendered with a randomly chosen font.
    font_paths = gb.glob(os.path.join(font_dir, "*"))
    print(font_paths)
    font_count = len(font_paths)
    for _ in range(n):
        font_path = font_paths[np.random.randint(font_count)]
        yield get_char_image(char, font_path)


def make_char_image_generator(char, random=False, n=3):
    if not random:
        return make_char_image_generator_for_all_fonts(char)
    return make_char_image_generator_randomly(char, n)


def get_all_chars(chars_file):
    with open(chars_file, 'r', encoding="utf-8") as fh:
        # Drop newlines and other whitespace so they are not treated as characters.
        all_chars = "".join(fh.read().split())
    print(all_chars)
    return all_chars


def prepare_one_char_dir(one_char):
    char_dir = os.path.join(output_dir, one_char)
    if not os.path.exists(char_dir):
        os.makedirs(char_dir)
    return char_dir


def main():
    prepare_output_dir()
    all_chars = get_all_chars(chars_file)
    for i, one_char in enumerate(all_chars):
        print(f"{i} word is {one_char}")
        char_dir = prepare_one_char_dir(one_char)
        char_image_generator = make_char_image_generator(one_char)
        background_generator = make_background_generator()
        # zip stops when the (finite) character generator is exhausted.
        char_and_background_generator = zip(char_image_generator, background_generator)
        for index, (char, back) in enumerate(char_and_background_generator):
            img_data = np.array(char)
            # need to blend one background pic: 3/5 character, 2/5 background
            if not regular_mode:
                img_data = np.array(char) // 5 * 3 + np.array(back) // 5 * 2
            img_path = os.path.join(char_dir, str(index) + ".jpg")
            img = Image.fromarray(img_data)
            img.save(img_path, "JPEG")


height, width = 64, 64
# Pillow and OpenCV take sizes as (width, height); NumPy shapes are (height, width, channels).
image_size = (width, height)
image_shape = (height, width, 3)
# white background and black font
regular_mode = True
output_dir = "output"
background_dir = "background"
font_dir = "font"
chars_file = "all.txt"

if __name__ == "__main__":
    main()
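Running the script with the default settings generates images of black characters on a white background.
Set regular_mode to False and run it again to see images with the distortions applied.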
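The script does not apply the translation distortion mentioned in the background list. A minimal NumPy-only sketch of one way to add it is shown here; the function name `translate_image` and its parameters are hypothetical and not part of the original code.

```python
import numpy as np


def translate_image(image_data, dx, dy, fill=255):
    """Shift an H x W x C image by (dx, dy) pixels, padding with `fill`.

    Hypothetical helper, not part of the original script.
    Positive dx moves content right, positive dy moves it down.
    """
    h, w = image_data.shape[:2]
    shifted = np.full_like(image_data, fill)
    # A shift larger than the canvas leaves nothing visible.
    if abs(dx) >= w or abs(dy) >= h:
        return shifted
    # Source and destination windows that remain inside the canvas.
    src_x = slice(max(0, -dx), min(w, w - dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    shifted[dst_y, dst_x] = image_data[src_y, src_x]
    return shifted
```

Under these assumptions it could be called on `np.array(char_image)` with small random offsets, for example `translate_image(data, np.random.randint(-5, 6), np.random.randint(-5, 6))`, before converting back with `Image.fromarray`.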
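Affine transformation
The distortion code includes a function that deforms the glyph using cv2's warpAffine API.
warp: to bend or distort
affine: the mathematical term for this class of transformations (definitions follow)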
def shear_image(char_image):
    # Shear horizontally by a random angle in [-15, 15) degrees.
    theta = np.random.randint(-15, 15) * np.pi / 180
    m_shear = np.array([[1, np.tan(theta), 0], [0, 1, 0]], dtype=np.float32)
    image_shear = cv2.warpAffine(np.array(char_image), m_shear, image_size)
    char_image = Image.fromarray(image_shear)
    return char_image
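To see what the 2x3 matrix passed to `warpAffine` does, the sketch below applies the same kind of matrix to a few pixel coordinates using only NumPy. It assumes the forward-mapping convention, in which the matrix maps a source point (x, y) to its destination; this is how `cv2.warpAffine` interprets the matrix unless the `WARP_INVERSE_MAP` flag is set. The helper name `apply_affine` and the fixed 15 degree angle are illustrative choices.

```python
import numpy as np


def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=np.float64)
    # Append a 1 to every point so the third column acts as a translation.
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous @ np.asarray(matrix, dtype=np.float64).T


theta = np.deg2rad(15)
m_shear = np.array([[1, np.tan(theta), 0],
                    [0, 1, 0]])

# Corners of a 64 x 64 image as (x, y), with y growing downward.
corners = [(0, 0), (63, 0), (0, 63), (63, 63)]
sheared = apply_affine(m_shear, corners)
print(sheared)
```

Each point keeps its y coordinate while x shifts by tan(theta) * y, so the bottom row of the image moves further than the top row. Because the output size stays 64 x 64, anything pushed past the edge is cropped, and pixels with no source are filled with the default border value (black) unless `borderValue` is passed to `warpAffine`.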
affine
https://www.merriam-webster.com/dictionary/affine
Definition of affine
Noun (entry 1 of 2): a relative by marriage; an in-law.
Adjective (entry 2 of 2): of, relating to, or being a transformation (such as a translation, a rotation, or a uniform stretching) that carries straight lines into straight lines and parallel lines into parallel lines but may alter distance between points and angles between lines. Example: affine geometry.
Chinese dictionary glosses: n. 1. relative by marriage 2. kin; adj. 1. of the same family 2. corresponding 3. (chemistry) having affinity 4. (mathematics) affine
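Affine transformation
A linear transformation followed by a translation; the source and target spaces may have different dimensions.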
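In the common notation x -> Ax + b, the matrix A is the linear part and b is the translation. The sketch below uses a made-up 3x2 matrix and offset to map 2D points into 3D and checks two properties; all numeric values are arbitrary assumptions chosen for the example.

```python
import numpy as np

# Arbitrary example values: A maps R^2 to R^3, b is the translation part.
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])
b = np.array([5.0, -2.0, 0.5])


def affine_map(x):
    """Affine map from R^2 to R^3: linear part A followed by translation b."""
    return A @ np.asarray(x, dtype=np.float64) + b


p = np.array([1.0, 4.0])
q = np.array([-3.0, 2.0])
midpoint = (p + q) / 2

# An affine map sends the midpoint of p and q to the midpoint of their images.
midpoint_preserved = np.allclose(affine_map(midpoint),
                                 (affine_map(p) + affine_map(q)) / 2)

# It is not linear when b is nonzero: the origin does not map to the origin.
origin_fixed = np.allclose(affine_map([0.0, 0.0]), np.zeros(3))
```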
https://baike.baidu.com/item/%E4%BB%BF%E5%B0%84%E5%87%BD%E6%95%B0/9276178
https://www.cnblogs.com/bnuvincent/p/6691189.html
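Affine Transformation
An affine transformation maps 2D coordinates to 2D coordinates by a linear transformation plus a translation. It preserves the "straightness" of 2D figures (a straight line is still a straight line after the transformation and does not bend, although a circular arc generally becomes an elliptical arc) and their "parallelness" (parallel lines remain parallel; the angle between intersecting lines is not preserved in general). The figure below shows the difference between c and d:
An affine transformation can be built by composing a series of atomic transformations: translation, scale, flip, rotation, and shear.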
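The composition of atomic transformations can be sketched with 3x3 homogeneous matrices in NumPy. The helper names and the numeric parameters below are illustrative assumptions; the sketch checks that the composed transformation keeps parallel segments parallel but does not keep right angles.

```python
import numpy as np


def translation(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=np.float64)


def scaling(sx, sy):
    # A flip is a scaling with a negative factor, e.g. scaling(-1, 1).
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=np.float64)


def rotation(degrees):
    t = np.deg2rad(degrees)
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t), np.cos(t), 0],
                     [0, 0, 1]])


def shear_x(k):
    return np.array([[1, k, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float64)


# Matrices compose right to left: shear first, then rotate, scale, translate.
combined = translation(10, 5) @ scaling(2, 0.5) @ rotation(30) @ shear_x(0.3)


def transform(matrix, point):
    x, y, _ = matrix @ np.array([point[0], point[1], 1.0])
    return np.array([x, y])


# Two parallel segments (same direction vector) before the transformation.
a0, a1 = (0, 0), (4, 2)
b0, b1 = (1, 3), (5, 5)
dir_a = transform(combined, a1) - transform(combined, a0)
dir_b = transform(combined, b1) - transform(combined, b0)

# A zero 2D cross product means the transformed segments are still parallel.
cross = dir_a[0] * dir_b[1] - dir_a[1] * dir_b[0]
still_parallel = np.isclose(cross, 0.0)

# The x and y axes start out perpendicular; check whether they stay that way.
origin = transform(combined, (0, 0))
axis_x = transform(combined, (1, 0)) - origin
axis_y = transform(combined, (0, 1)) - origin
right_angle_preserved = np.isclose(axis_x @ axis_y, 0.0)

# The first two rows form the 2x3 matrix shape that cv2.warpAffine expects.
matrix_2x3 = combined[:2, :]
```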
An affine transformation can be expressed by the following formula:
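A common way to write a 2D affine transformation, using symbols chosen here for illustration rather than taken from the source, is a 2x2 linear part plus a translation vector, or equivalently a single 3x3 matrix in homogeneous coordinates:

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} t_x \\ t_y \end{pmatrix}
\quad\Longleftrightarrow\quad
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
=
\begin{pmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
```

The top two rows of the 3x3 matrix are the 2x3 matrix that `cv2.warpAffine` takes as its second argument.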
Reference: http://wenku.baidu.com/view/826a796027d3240c8447ef20.html
Shear transformation (named after shear force)
https://www.quora.com/What-is-the-difference-between-stress-and-shear-stress
See the picture of the element below for an idea of normal and shear stresses, and how each deforms the element.
https://www.ques10.com/p/22033/define-the-terms-with-example-1-reflection-2-shear/
The three basic transformations of scaling, rotating, and translating are the most useful and most common. There are some other transformations which are useful in certain applications. Two such transformations are reflection and shear.
Reflection:-
A reflection is a transformation that produces a mirror image of an object relative to an axis of reflection. We can choose an axis of reflection in the xy plane or perpendicular to the xy plane.
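As a small illustration of reflection about the coordinate axes, the sketch below mirrors a triangle with 2x2 matrices in NumPy. The triangle coordinates and variable names are made up for the example.

```python
import numpy as np

# Reflection matrices for column vectors (x, y).
reflect_about_x_axis = np.array([[1, 0], [0, -1]])   # (x, y) -> (x, -y)
reflect_about_y_axis = np.array([[-1, 0], [0, 1]])   # (x, y) -> (-x, y)

triangle = np.array([[1, 1], [4, 1], [1, 3]])  # one (x, y) point per row

# Points are stored as rows, so multiply by the transpose.
mirrored_x = triangle @ reflect_about_x_axis.T
mirrored_y = triangle @ reflect_about_y_axis.T

# Reflecting twice about the same axis restores the original shape.
restored = mirrored_x @ reflect_about_x_axis.T
```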
Shear:-
A transformation that slants the shape of an object is called the shear transformation. Two common shearing transformations are used. One shifts x co-ordinate values and other shifts y co-ordinate values. However, in both the cases only one co-ordinate (x or y) changes its co-ordinates and other preserves its values.
X Shear:-
The x shear preserves the y co-ordinates, but changes the x values which causes vertical lines to tilt right or left as shown in the figure below. The transformation matrix for x shear is given as
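Assuming column vectors in homogeneous coordinates and a shear factor written here as $sh_x$ (both the convention and the symbol are assumptions; with row vectors the matrix is transposed), the x shear can be written as:

```latex
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
=
\begin{pmatrix} 1 & sh_x & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad x' = x + sh_x \, y, \quad y' = y
```

This has the same form as `m_shear` in `shear_image`, with $sh_x = \tan\theta$.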
Y shear:-
The y shear preserves the x coordinates, but changes the y values which causes horizontal lines to transform into lines which slope up or down, as shown in the figure below. The transformation matrix for y shear is given as
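Under the same assumed convention (column vectors in homogeneous coordinates, with the shear factor written here as $sh_y$), the y shear can be written as:

```latex
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ sh_y & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad x' = x, \quad y' = y + sh_y \, x
```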