[NSFW Detection] Detecting NSFW Images
Training the Model
Training with nsfw_data_scraper
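- NSFW: short for "Not Safe/Suitable For Work", an internet term for content that is unsuitable for the workplace, typically nudity, violence, pornography, or otherwise offensive material that is inappropriate in public settings. Hyperlinks to such content are labeled NSFW to warn viewers.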
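The nsfw_data_scraper repository stores the URLs of tens of thousands of images, sorted into the following categories for training:
- Drawing: harmless art or artistic drawings;
- Hentai: pornographic art, unsuitable for most work environments;
- Neutral: general, harmless content;
- Porn: indecent content and acts, usually involving genitalia;
- Sexy: provocative content that is inappropriate for the workplace.
The project also documents how to collect the images: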
    $ docker build . -t docker_nsfw_data_scraper
    Sending build context to Docker daemon  426.3MB
    Step 1/3 : FROM ubuntu:18.04
     ---> 775349758637
    Step 2/3 : RUN apt update && apt upgrade -y && apt install wget rsync imagemagick default-jre -y
     ---> Using cache
     ---> b2129908e7e2
    Step 3/3 : ENTRYPOINT ["/bin/bash"]
     ---> Using cache
     ---> d32c5ae5235b
    Successfully built d32c5ae5235b
    Successfully tagged docker_nsfw_data_scraper:latest
    $ # Next command might run for several hours. It is recommended to leave it overnight
    $ docker run -v $(pwd):/root/nsfw_data_scraper docker_nsfw_data_scraper scripts/runall.sh
    Getting images for class: neutral
    ...
    ...
    $ ls data
    test  train
    $ ls data/train/
    drawings  hentai  neutral  porn  sexy
    $ ls data/test/
    drawings  hentai  neutral  porn  sexy
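Once the scraper has finished, it can be useful to check how many images ended up in each class before training. The following is a minimal sketch, assuming the `data/train` and `data/test` layout shown in the listing above. The function name `count_images` and the set of file extensions are illustrative choices, not part of nsfw_data_scraper.

```python
from pathlib import Path

# Assumed extensions; adjust to whatever the scraper actually downloaded.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(split_dir):
    """Return {class_name: image_count} for one split directory such as data/train."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


if __name__ == "__main__":
    for split in ("data/train", "data/test"):
        # Skip splits that do not exist yet (for example before the scraper has run).
        if Path(split).is_dir():
            print(split, count_images(split))
```

A strongly unbalanced count between classes is worth noticing early, since it tends to skew the trained classifier toward the larger classes.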
Model
A pretrained model is available at https://github.com/rockyzhengwu/nsfw
- git clone https://github.com/rockyzhengwu/nsfw
The pretrained model is in the data/ directory.
- cd nsfw
- python nsfw_predict.py /tmp/test/test.jpeg
Output:
- {'class': 'sexy', 'probability': {'drawings': 0.008320281, 'hentai': 0.0011919827, 'neutral': 0.13077603, 'porn': 0.13146976, 'sexy': 0.72824186}}
class: the category the image was assigned to
probability: the probability score for each category
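A caller usually wants to turn this dictionary into a yes/no decision. The sketch below is one hedged way to do that: it treats `porn`, `hentai`, and `sexy` as unsafe and flags an image when their combined probability reaches a threshold. The function name `is_unsafe`, the choice of unsafe classes, and the 0.5 threshold are all assumptions for illustration, not something the model or the repository prescribes.

```python
# Assumed policy: which classes count as unsafe, and how much combined
# probability is enough to flag an image. Tune both for your own use case.
UNSAFE_CLASSES = ("porn", "hentai", "sexy")
THRESHOLD = 0.5


def is_unsafe(result, threshold=THRESHOLD):
    """Flag an image when the summed probability of the unsafe classes reaches the threshold."""
    probs = result["probability"]
    unsafe_score = sum(probs.get(name, 0.0) for name in UNSAFE_CLASSES)
    return unsafe_score >= threshold


if __name__ == "__main__":
    # The sample output shown above.
    sample = {
        "class": "sexy",
        "probability": {
            "drawings": 0.008320281,
            "hentai": 0.0011919827,
            "neutral": 0.13077603,
            "porn": 0.13146976,
            "sexy": 0.72824186,
        },
    }
    print(is_unsafe(sample))
```

For the sample output, the unsafe score is about 0.86, so the image is flagged. Summing the unsafe classes rather than looking only at the top class catches images whose probability is split between `porn` and `sexy`.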
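NSFW Detection Service
With a trained model in hand, the next step is to set up a service. Here we use TensorFlow Serving to expose the model, and deploy it with Docker to keep installation simple.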
    NSFWDATA="/home/www/nsfw/data"
    docker run -d --rm -p 8501:8501 \
        --name nsfw \
        -v "$NSFWDATA/models:/models/nsfw" \
        -e MODEL_NAME=nsfw \
        tensorflow/serving
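The serving image supports two ways of calling the model: gRPC and HTTP. gRPC listens on port 8500 by default and HTTP on port 8501; the command above publishes only the HTTP port. The program in the serving image automatically loads models found under /models inside the container, and MODEL_NAME selects which model under /models to serve.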
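Before sending predictions, it can help to confirm that the model was actually loaded. TensorFlow Serving's REST API exposes a model status endpoint at `/v1/models/<model name>` alongside the `:predict` endpoint. The sketch below builds both URLs and queries the status using only the standard library. The host `localhost` and the helper names are assumptions for illustration.

```python
import json
import urllib.request


def model_url(host, model_name, port=8501):
    """Base REST URL for a model; 8501 is the HTTP port published by the docker command."""
    return f"http://{host}:{port}/v1/models/{model_name}"


def predict_url(host, model_name, port=8501):
    """URL of the predict endpoint for the same model."""
    return model_url(host, model_name, port) + ":predict"


def model_status(host, model_name, port=8501, timeout=5):
    """Fetch the model status document (requires a running TensorFlow Serving container)."""
    with urllib.request.urlopen(model_url(host, model_name, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(predict_url("localhost", "nsfw"))
    # Uncomment once the container is running:
    # print(model_status("localhost", "nsfw"))
```

If the status request fails or the model name is unknown to the server, the usual suspects are the volume mount path and the `MODEL_NAME` value in the docker command.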
Calling the HTTP API
URL: http://ip:port/v1/models/nsfw:predict
Returned result:
    {
      "classes": "porn",
      "probabilities": {
        "drawings": 0.0000170060648,
        "hentai": 0.00108581863,
        "neutral": 0.000101140722,
        "porn": 0.816358209,
        "sexy": 0.182437778
      }
    }
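The result shown above uses label names, while the client script in this article reads a class index and a probability list from an `outputs` object in the HTTP response. The sketch below shows that mapping step in isolation. It assumes a single-image request and a raw response of the shape `{"outputs": {"classes": [i], "probabilities": [[...]]}}`, which is what the script expects. The function name `decode_response` is hypothetical.

```python
import json

# Index-to-label mapping used by the client script in this article.
LABEL_MAP = {0: "drawings", 1: "hentai", 2: "neutral", 3: "porn", 4: "sexy"}


def decode_response(body):
    """Convert a raw predict response (JSON text) into label names.

    Assumes the shape {"outputs": {"classes": [i], "probabilities": [[p0, ...]]}}
    for a single-image request.
    """
    outputs = json.loads(body)["outputs"]
    class_index = int(outputs["classes"][0])
    probabilities = outputs["probabilities"][0]
    return {
        "classes": LABEL_MAP[class_index],
        "probabilities": {LABEL_MAP[i]: p for i, p in enumerate(probabilities)},
    }


if __name__ == "__main__":
    # Hand-built raw response carrying the same numbers as the result above.
    raw = json.dumps({
        "outputs": {
            "classes": [3],
            "probabilities": [[0.0000170060648, 0.00108581863, 0.000101140722,
                               0.816358209, 0.182437778]],
        }
    })
    print(decode_response(raw))
```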
Python Script
    import sys
    import json

    import numpy as np
    import requests
    from PIL import Image

    _IMAGE_SIZE = 64
    # TensorFlow Serving endpoint; replace with the address of your own server
    SERVER_URL = 'http://192.168.1.123:8501/v1/models/nsfw:predict'
    _LABEL_MAP = {0: 'drawings', 1: 'hentai', 2: 'neutral', 3: 'porn', 4: 'sexy'}


    def standardize(img):
        """Scale pixel values to zero mean and unit variance."""
        mean = np.mean(img)
        std = np.std(img)
        if std == 0:  # constant image: avoid dividing by zero
            std = 1.0
        return (img - mean) / std


    def load_image(image_path):
        """Load an image, resize it, and return a standardized array."""
        # Force three channels so grayscale or RGBA files give a consistent shape
        # (assumes the model expects RGB input, as it receives from JPEG files)
        img = Image.open(image_path).convert('RGB')
        img = img.resize((_IMAGE_SIZE, _IMAGE_SIZE))
        data = np.asarray(img, dtype="float32")
        data = standardize(data)
        return data.astype(np.float16, copy=False)


    def nsfw_predict(image_data):
        """Send the image to TensorFlow Serving and map the result to label names."""
        pay_load = json.dumps({"inputs": [image_data.tolist()]})
        response = requests.post(SERVER_URL, data=pay_load)
        response.raise_for_status()  # fail loudly on HTTP errors
        outputs = response.json()['outputs']
        predict_result = {"classes": _LABEL_MAP.get(outputs['classes'][0])}
        predict_result['probabilities'] = {
            _LABEL_MAP.get(i): p for i, p in enumerate(outputs['probabilities'][0])
        }
        return predict_result


    if __name__ == '__main__':
        image_data = load_image(sys.argv[1])
        predict = nsfw_predict(image_data)
        print(predict)
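The `standardize` step rescales each image to zero mean and unit variance. The sketch below reproduces that arithmetic on a short list of pixel values with only the standard library, and includes a guard for constant images, where the standard deviation is zero and the division is undefined. The name `standardize_values` is hypothetical. `statistics.pstdev` is used because it computes the population standard deviation, which matches the default behavior of `np.std`.

```python
import statistics


def standardize_values(values):
    """Return the values shifted to zero mean and scaled to unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std, like np.std's default
    if std == 0:
        # Every value is identical, so every standardized value is zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    pixels = [0, 64, 128, 255]
    print(standardize_values(pixels))
```

Because the result depends only on each image's own statistics, a dark photo and a bright photo of the same scene end up with similar inputs to the model.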
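Install python3 and pip3
Install the script's dependencies
    sudo pip3 install Pillow   # provides the PIL module imported by the script
    # If the install fails, run this first:
    sudo yum install python3-devel
    sudo pip3 install requests
    sudo pip3 install numpy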
Run
    [root@localhost data]# python3 yellow /home/1.jpeg
    {'classes': 'neutral', 'probabilities': {'drawings': 0.00102156017, 'hentai': 0.000259996828, 'neutral': 0.998707414, 'porn': 1.07145597e-05, 'sexy': 3.48052083e-07}}
Calling Python from Java
    // Requires java.io.BufferedReader and java.io.InputStreamReader

    // Location of the Python script, read from configuration
    @Value("${yellow.path}")
    private String yellowPath;

    /**
     * NSFW check using our own model.
     * @param imagePath image path, passed as an argument to the Python script
     * @return the last line printed by the script, or "" if the call fails
     */
    public String check(String imagePath) {
        String[] arguments = new String[] {"python3", yellowPath, imagePath};
        String classes = "";
        try {
            Process process = Runtime.getRuntime().exec(arguments);
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), "GBK"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                    classes = line;  // keep the last line of output
                }
            }
            int exitCode = process.waitFor();
            System.out.println(exitCode);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return classes;
    }
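The Java method above returns the last line the script prints, which is a Python dict literal with single quotes rather than JSON. If the Java side needs to parse the result instead of passing it along as text, one option is to have the script print JSON. The sketch below shows the difference. `format_for_caller` is a hypothetical helper, and changing the script this way is a suggestion, not part of the original setup.

```python
import json


def format_for_caller(predict):
    """Serialize the prediction dict as a single line of JSON for a calling process."""
    return json.dumps(predict, ensure_ascii=True)


if __name__ == "__main__":
    predict = {
        "classes": "neutral",
        "probabilities": {"neutral": 0.998707414, "porn": 1.07145597e-05},
    }
    print(str(predict))                # Python repr: single quotes, not valid JSON
    print(format_for_caller(predict))  # valid JSON, parseable by any JSON library
```

With `ensure_ascii=True` the output is plain ASCII, so the character set chosen on the Java side no longer affects how the line is decoded.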
References:
https://blog.csdn.net/zhulin2012/article/details/106394274/
https://blog.52itstyle.vip/archives/4863/#%E9%89%B4%E9%BB%84%E6%9C%8D%E5%8A%A1