CAPTCHA Recognition with Keras and a CRNN Network

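In this article we build a CAPTCHA recognition system with the Keras framework. The system uses a CRNN (convolutional recurrent neural network) architecture, which combines a convolutional neural network (CNN) with a recurrent neural network (RNN), to recognize the characters in a CAPTCHA image. This approach lets us recognize CAPTCHAs that contain multiple characters.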

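1. Environment Setup
First, make sure Keras and TensorFlow are installed. If they are not, install them with the following command:

bash

pip install tensorflow opencv-python numpy pillow
Keras: used to build and train the deep learning model; it ships with TensorFlow as tensorflow.keras.
TensorFlow: the backend framework for Keras.
OpenCV: used for image processing.
NumPy: used for array and data handling.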
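Before moving on, it can help to confirm that the packages from the pip command are actually present in the active environment. The following is a minimal sketch using only the standard library; `installed_versions` and `REQUIRED` are hypothetical names introduced for this illustration.

```python
from importlib import metadata

# Distribution names exactly as they are passed to pip
REQUIRED = ["tensorflow", "opencv-python", "numpy", "pillow"]

def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed can be added with pip before running the rest of the code.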
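2. Dataset Preparation and Preprocessing
A CAPTCHA usually contains several characters, so the image needs some preprocessing before the characters can be located and passed to the deep learning model. The basic preprocessing steps are:

Grayscale conversion: convert the image to grayscale to remove color interference.
Binarization: convert the image to black and white to increase the contrast between characters and background.
Denoising: apply a Gaussian blur to remove noise from the image.
Character segmentation: separate the character regions in the image for later processing.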
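(1) Image preprocessing code
python

import cv2
import numpy as np

def preprocess_image(img_path):
    # Read the image; cv2.imread returns None if the file cannot be read
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {img_path}")

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarize with Otsu's threshold; THRESH_BINARY_INV turns dark pixels white (255)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Denoise: Gaussian blur
    blurred = cv2.GaussianBlur(binary, (5, 5), 0)

    # Contour detection (optional); OpenCV 4.x returns two values
    contours, _ = cv2.findContours(blurred, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    return blurred, contours

# Example image path
img_path = 'captcha_images/test1.png'
preprocessed_image, contours = preprocess_image(img_path)

# Show the result
cv2.imshow('Processed Image', preprocessed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()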
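The binarization step relies on `cv2.THRESH_OTSU`, which chooses the threshold automatically. As a rough illustration of what that flag computes, the sketch below implements Otsu's method in plain NumPy on a synthetic image. `otsu_threshold` is a hypothetical helper written for this illustration, not part of OpenCV, and its result may differ slightly from OpenCV's own.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t (0..255) that maximizes the between-class variance.

    gray: 2-D uint8 array. Pixels <= t form one class, pixels > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)          # weight of the class "<= t" for every t
    mu = np.cumsum(prob * levels)    # cumulative (unnormalized) mean of that class
    mu_total = mu[-1]

    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                # skip thresholds that leave one class empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Synthetic example: a dark "character" (value 50) on a light background (value 200)
img = np.full((64, 128), 200, dtype=np.uint8)
img[20:40, 30:50] = 50

t = otsu_threshold(img)
# Same convention as THRESH_BINARY_INV: pixels above t become 0, the rest 255
binary_inv = np.where(img > t, 0, 255).astype(np.uint8)
```

With the inverted convention the dark character ends up white on a black background, which is the form contour detection expects.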
(2) Extracting character regions
Running contour detection on the preprocessed image lets us extract the character regions:

python

def extract_characters(preprocessed_img, contours):
    boxes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w > 10 and h > 10:  # skip noise
            boxes.append((x, y, w, h))

    # Sort left to right by the x coordinate of each box's top-left corner
    boxes.sort(key=lambda box: box[0])
    return [preprocessed_img[y:y+h, x:x+w] for x, y, w, h in boxes]

# Extract the character regions
char_images = extract_characters(preprocessed_image, contours)

# Show the extracted characters
for i, char_img in enumerate(char_images):
    cv2.imshow(f'Character {i+1}', char_img)
    cv2.waitKey(0)

cv2.destroyAllWindows()
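The noise filter and the left-to-right ordering depend only on the bounding boxes, so that logic can be checked without any image. The sketch below does this on made-up boxes; `order_boxes` and the sample coordinates are hypothetical and serve only as an illustration.

```python
def order_boxes(boxes, min_w=10, min_h=10):
    """Drop boxes that are too small, then sort the rest left to right.

    boxes: iterable of (x, y, w, h) tuples, the format returned by cv2.boundingRect.
    """
    kept = [box for box in boxes if box[2] > min_w and box[3] > min_h]
    return sorted(kept, key=lambda box: box[0])

# Hypothetical boxes in the arbitrary order contour detection might return them;
# (40, 3, 4, 4) stands for a small noise speck
boxes = [(70, 12, 18, 30), (5, 10, 20, 32), (40, 3, 4, 4), (38, 11, 19, 31)]
ordered = order_boxes(boxes)
print(ordered)  # [(5, 10, 20, 32), (38, 11, 19, 31), (70, 12, 18, 30)]
```

Sorting on the x coordinate of the box, rather than on pixel values of the crop, is what keeps the characters in reading order.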
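3. Building the CRNN Model
To recognize the characters in a CAPTCHA image, we build a CRNN (convolutional recurrent neural network) model. It combines a convolutional neural network (CNN), which extracts image features, with a recurrent neural network (RNN), which models the sequential dependencies between characters.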

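(1) Building the CRNN model
python

from tensorflow.keras import layers, models

def build_crnn_model(input_shape=(64, 128, 1), num_classes=36, sequence_length=4):
    input_img = layers.Input(shape=input_shape)

    # Convolution block 1
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = layers.MaxPooling2D((2, 2))(x)

    # Convolution block 2
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2))(x)

    # Convolution block 3
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)

    # Turn the feature map into a sequence with one time step per column (width axis)
    height, width = input_shape[0] // 4, input_shape[1] // 4
    x = layers.Permute((2, 1, 3))(x)
    x = layers.Reshape(target_shape=(width, height * 128))(x)

    # LSTM layer: processes the column sequence
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)

    # Output layer: one softmax over num_classes for each of the sequence_length characters,
    # so the output shape (sequence_length, num_classes) matches integer labels of length sequence_length
    x = layers.Flatten()(x)
    x = layers.Dense(sequence_length * num_classes)(x)
    x = layers.Reshape(target_shape=(sequence_length, num_classes))(x)
    output = layers.Softmax(axis=-1)(x)

    model = models.Model(inputs=input_img, outputs=output)

    return model

# Build the CRNN model
model = build_crnn_model(input_shape=(64, 128, 1), num_classes=36, sequence_length=4)

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()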
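(2) Model architecture notes
Convolutional layers: extract image features.
LSTM layer: a recurrent network that captures the sequential information across the characters.
Output layer: a classification output for each character, using the softmax activation.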
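It is easy to lose track of tensor shapes between the convolutional part and the recurrent part. The sketch below walks through the arithmetic for a 64x128 grayscale input, assuming two 2x2 max-pooling layers and 128 filters in the last convolution, as in the model code of this section. `feature_map_shape` is a hypothetical helper for illustration only.

```python
def feature_map_shape(input_shape, num_pools=2, channels=128):
    """Shape after `num_pools` 2x2 max-pooling layers.

    Convolutions with padding='same' keep height and width, so only pooling shrinks them.
    """
    height, width, _ = input_shape
    for _ in range(num_pools):
        height, width = height // 2, width // 2
    return height, width, channels

h, w, c = feature_map_shape((64, 128, 1))
print((h, w, c))    # (16, 32, 128)
print((w, h * c))   # one time step per image column: (32, 2048)
print((h * w, c))   # one time step per feature-map cell: (512, 128)
```

The last two lines show two ways of flattening the same feature map into a sequence. Either way, the number of time steps is much larger than the number of characters, so the step count and the label length have to be reconciled before a per-character loss can be applied.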
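4. Training the Model
During training, the image data is the input and the label data is the target output. Each label represents the character sequence shown in the CAPTCHA image; because the model is compiled with sparse_categorical_crossentropy, the characters must be encoded as integer class indices.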
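Text labels such as "AB12" have to be turned into integer arrays before they can be used as training targets. The following is a minimal sketch of that encoding, assuming fixed-length CAPTCHAs of 4 characters drawn from the 36-character set that this article uses for prediction. `encode_labels` and `decode_label` are hypothetical helper names.

```python
import numpy as np

# Uppercase letters followed by digits, 36 classes in total
CHAR_SET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHAR_SET)}

def encode_labels(texts, sequence_length=4):
    """Convert a list of strings into an int array of shape (len(texts), sequence_length)."""
    encoded = np.zeros((len(texts), sequence_length), dtype=np.int64)
    for row, text in enumerate(texts):
        if len(text) != sequence_length:
            raise ValueError(f"Expected {sequence_length} characters, got {text!r}")
        # A character outside CHAR_SET raises KeyError here
        encoded[row] = [CHAR_TO_INDEX[ch] for ch in text]
    return encoded

def decode_label(indices):
    """Convert a sequence of class indices back into a string."""
    return "".join(CHAR_SET[i] for i in indices)

labels = encode_labels(["AB12", "ZZ99"])
print(labels.tolist())         # [[0, 1, 27, 28], [25, 25, 35, 35]]
print(decode_label(labels[0])) # AB12
```

The index of each character is simply its position in `CHAR_SET`, so the same string must be used when decoding predictions.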

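(1) Loading the dataset and training
python

import cv2
import numpy as np

# Assume image_paths (a list of file paths) and labels (one sequence of integer
# class indices per image) have already been prepared

# Read as grayscale and resize to width 128, height 64 so every image has the same shape
images = np.array([cv2.resize(cv2.imread(img_path, cv2.IMREAD_GRAYSCALE), (128, 64)) for img_path in image_paths])
labels = np.array(labels)  # each label is a sequence of characters, shape (num_samples, sequence_length)

# Preprocess the images: scale to [0, 1] and add the channel dimension
images = images.astype('float32') / 255.0
images = images.reshape(images.shape[0], 64, 128, 1)  # image size is 64x128 (height x width)

# Train the model
model.fit(images, labels, epochs=10, batch_size=32)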
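Training on all available data leaves nothing to monitor overfitting with. One possible way to hold out part of the data is sketched below with NumPy only; `train_val_split`, the 20% fraction, and the dummy arrays are assumptions made for this illustration.

```python
import numpy as np

def train_val_split(images, labels, val_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and validation parts."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same number of samples")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(len(images) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return images[train_idx], labels[train_idx], images[val_idx], labels[val_idx]

# Dummy data with the shapes used in this article: 10 images of 64x128x1, 4 labels each
dummy_images = np.zeros((10, 64, 128, 1), dtype=np.float32)
dummy_labels = np.arange(40).reshape(10, 4)

x_train, y_train, x_val, y_val = train_val_split(dummy_images, dummy_labels)
print(x_train.shape, x_val.shape)  # (8, 64, 128, 1) (2, 64, 128, 1)
```

The validation part can then be passed to Keras through the `validation_data=(x_val, y_val)` argument of `model.fit`.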
5. Model Evaluation and Prediction
After training, we can evaluate the model's performance and predict new CAPTCHA images.

(1) Evaluating the model
python

# Evaluate the model on the test set; test_image_paths and test_labels are assumed
# to be prepared the same way as the training data
test_images = np.array([cv2.resize(cv2.imread(img_path, cv2.IMREAD_GRAYSCALE), (128, 64)) for img_path in test_image_paths])
test_labels = np.array(test_labels)

# Normalize the images
test_images = test_images.astype('float32') / 255.0
test_images = test_images.reshape(test_images.shape[0], 64, 128, 1)

# Evaluate the model
loss, accuracy = model.evaluate(test_images, test_labels, batch_size=32)
print(f"Test Accuracy: {accuracy * 100:.2f}%")
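For a model with one softmax per character position, the accuracy reported by Keras is typically averaged over individual characters, while a CAPTCHA only counts as solved when every character is right. The sketch below computes both numbers with NumPy, assuming predictions of shape (num_samples, sequence_length, num_classes). `captcha_accuracy` and the toy data are hypothetical.

```python
import numpy as np

def captcha_accuracy(y_true, y_prob):
    """Return (per-character accuracy, whole-CAPTCHA accuracy).

    y_true: int array of shape (num_samples, sequence_length)
    y_prob: float array of shape (num_samples, sequence_length, num_classes)
    """
    y_pred = np.argmax(y_prob, axis=-1)
    correct = (y_pred == y_true)
    char_acc = float(correct.mean())
    seq_acc = float(correct.all(axis=1).mean())
    return char_acc, seq_acc

# Toy example: two CAPTCHAs, the second one has a wrong last character
y_true = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
y_pred = np.array([[0, 1, 2, 3], [4, 5, 6, 0]])
y_prob = np.eye(36)[y_pred]  # one-hot "probabilities", shape (2, 4, 36)

char_acc, seq_acc = captcha_accuracy(y_true, y_prob)
print(char_acc, seq_acc)  # 0.875 0.5
```

In this toy case seven of eight characters are correct, but only one of the two CAPTCHAs is fully solved, so the whole-CAPTCHA figure is the more honest measure of usefulness.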
(2) Predicting a CAPTCHA
Use the trained model to predict a single CAPTCHA.

python
def predict_captcha(model, img_path, char_set="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", sequence_length=4):
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (128, 64))  # (width, height), matching the 64x128 model input
    img = img.astype('float32') / 255.0
    img = np.expand_dims(img, axis=0)  # add the batch dimension
    img = np.expand_dims(img, axis=3)  # add the channel dimension

    # Predict: take the most probable class at each character position
    pred = model.predict(img)
    pred_label = ''.join([char_set[np.argmax(pred[0][i])] for i in range(sequence_length)])

    return pred_label

# Test the prediction
test_image_path = "captcha_images/test1.png"
predicted_label = predict_captcha(model, test_image_path)
print(f"Predicted CAPTCHA label: {predicted_label}")
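When predicting many images at once, it can be useful to decode the whole batch and keep a rough confidence value for each result. The sketch below does this with NumPy on a hand-built probability array, assuming model output of shape (batch, sequence_length, num_classes). `decode_predictions` is a hypothetical helper, and using the lowest per-character probability as the confidence is a heuristic chosen for this illustration.

```python
import numpy as np

CHAR_SET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def decode_predictions(probs, char_set=CHAR_SET):
    """Decode a batch of per-position class probabilities.

    probs: array of shape (batch, sequence_length, num_classes)
    Returns a list of (text, confidence) tuples, where confidence is the
    smallest winning probability among the characters of that CAPTCHA.
    """
    indices = np.argmax(probs, axis=-1)
    best = np.max(probs, axis=-1)
    results = []
    for row_indices, row_best in zip(indices, best):
        text = "".join(char_set[i] for i in row_indices)
        results.append((text, float(row_best.min())))
    return results

# Hand-built example standing in for model.predict output: one CAPTCHA reading "AB12"
probs = np.full((1, 4, 36), 0.1 / 35)
probs[0, np.arange(4), [0, 1, 27, 28]] = 0.9

decoded = decode_predictions(probs)
print(decoded[0][0])  # AB12
```

Results with a low confidence value could be flagged for manual review instead of being accepted automatically.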

posted @ 2025-02-15 10:58 ttocr、com
