[Content Moderation] NSFW Image Detection

Training the model

Training with nsfw_data_scraper

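  • NSFW: Not Safe/Suitable For Work (abbreviated NSFW) is internet slang for content that is unsuitable for public settings, typically nudity, violence, pornography, or offensive material. Marking a hyperlink to such content with NSFW warns viewers before they open it.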

The nsfw_data_scraper repository holds URLs for many thousands of images, sorted into categories for training:

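  • Drawing: harmless art or artistic drawings;
  • Hentai: pornographic art, unsuitable for most work environments;
  • Neutral: general, harmless content;
  • Porn: indecent content and acts, usually involving genitalia;
  • Sexy: provocative content that is out of place in most settings.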

The project also provides a way to collect the images:

$ docker build . -t docker_nsfw_data_scraper
Sending build context to Docker daemon  426.3MB
Step 1/3 : FROM ubuntu:18.04
 ---> 775349758637
Step 2/3 : RUN apt update  && apt upgrade -y  && apt install wget rsync imagemagick default-jre -y
 ---> Using cache
 ---> b2129908e7e2
Step 3/3 : ENTRYPOINT ["/bin/bash"]
 ---> Using cache
 ---> d32c5ae5235b
Successfully built d32c5ae5235b
Successfully tagged docker_nsfw_data_scraper:latest
$ # Next command might run for several hours. It is recommended to leave it overnight
$ docker run -v $(pwd):/root/nsfw_data_scraper docker_nsfw_data_scraper scripts/runall.sh
Getting images for class: neutral
...
...
$ ls data
test  train
$ ls data/train/
drawings  hentai  neutral  porn  sexy
$ ls data/test/
drawings  hentai  neutral  porn  sexy
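
After the scraper finishes, it can be worth checking how many images landed in each class before training. The following is a minimal sketch, assuming the `data/train` and `data/test` layout shown in the listing above. The `count_images` helper and the list of file extensions are illustrative assumptions, not part of nsfw_data_scraper.

```python
from pathlib import Path

# Class names taken from the `ls data/train/` listing above.
CLASSES = ["drawings", "hentai", "neutral", "porn", "sexy"]
# Assumed file extensions; adjust to whatever the scraper actually produced.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(data_root):
    """Return {split: {class_name: image_count}} for the train/test splits."""
    counts = {}
    for split in ("train", "test"):
        counts[split] = {}
        for name in CLASSES:
            class_dir = Path(data_root) / split / name
            if not class_dir.is_dir():
                # A missing directory is reported as zero images.
                counts[split][name] = 0
                continue
            counts[split][name] = sum(
                1
                for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


if __name__ == "__main__":
    for split, per_class in count_images("data").items():
        print(split, per_class)
```

A class with far fewer images than the others is a hint that some download URLs failed and the scraper may need to be rerun for that class.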

Model

Pretrained model: https://github.com/rockyzhengwu/nsfw

  • git clone https://github.com/rockyzhengwu/nsfw

The trained model is in the data/ directory.

  • cd nsfw
  • python nsfw_predict.py /tmp/test/test.jpeg

Output:

  • {'class': 'sexy', 'probability': {'drawings': 0.008320281, 'hentai': 0.0011919827, 'neutral': 0.13077603, 'porn': 0.13146976, 'sexy': 0.72824186}}

class: the category the image is assigned to
probability: the probability score for each category
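
The prediction is only a set of scores, so the caller still has to decide what counts as a violation. Below is a minimal sketch of one possible rule: sum the scores of the classes treated as unsafe and compare the sum with a cut-off. The choice of unsafe classes, the 0.5 threshold, and the `is_unsafe` helper are all hypothetical and should be tuned to the actual moderation policy.

```python
# Hypothetical policy: which classes count as unsafe, and the cut-off score.
UNSAFE_CLASSES = ("hentai", "porn", "sexy")
UNSAFE_THRESHOLD = 0.5


def is_unsafe(result, threshold=UNSAFE_THRESHOLD):
    """Return True when the summed unsafe-class probability reaches the threshold."""
    probs = result["probability"]
    unsafe_score = sum(probs.get(name, 0.0) for name in UNSAFE_CLASSES)
    return unsafe_score >= threshold


if __name__ == "__main__":
    # The sample output of nsfw_predict.py shown above.
    sample = {
        "class": "sexy",
        "probability": {
            "drawings": 0.008320281,
            "hentai": 0.0011919827,
            "neutral": 0.13077603,
            "porn": 0.13146976,
            "sexy": 0.72824186,
        },
    }
    print(is_unsafe(sample))  # prints True (unsafe score is about 0.86)
```

Summing the scores rather than looking only at the top class catches images where the unsafe probability is split between, say, porn and sexy.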

Detection service

Once the model is trained, the next step is to put it behind a service. Here we expose it directly with TensorFlow Serving and, to keep installation simple, deploy it with Docker.

NSFWDATA="/home/www/nsfw/data"
docker run -d --rm -p 8501:8501 \
   --name nsfw \
   -v "$NSFWDATA/models:/models/nsfw" \
   -e MODEL_NAME=nsfw \
   tensorflow/serving

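The serving image supports two ways of calling it: gRPC and HTTP. The default gRPC port is 8500 and the default HTTP port is 8501. The server inside the image automatically loads models from /models in the container, and MODEL_NAME selects which model under /models to serve.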
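
Before sending images, it helps to confirm that the container has actually loaded the model. TensorFlow Serving's REST API exposes a model status endpoint at `/v1/models/<name>`. The sketch below queries it with the standard library only. The helper names are illustrative, and the host is an assumption to be replaced with the machine running the container.

```python
import json
from urllib.request import urlopen


def model_status_url(host, port=8501, model_name="nsfw"):
    """Build the REST model-status URL for a TensorFlow Serving instance."""
    return f"http://{host}:{port}/v1/models/{model_name}"


def is_model_available(status):
    """Return True if any model version in the status response is AVAILABLE."""
    versions = status.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)


def check_model(host, port=8501, model_name="nsfw", timeout=5):
    """Query the status endpoint and report whether the model is being served."""
    with urlopen(model_status_url(host, port, model_name), timeout=timeout) as resp:
        return is_model_available(json.load(resp))


# Usage, assuming the container from the docker run command is on this machine:
#   check_model("localhost")
```

If the call fails or returns False, the usual suspects are the volume mount path and the version subdirectory that TensorFlow Serving expects under the model directory.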

HTTP API endpoint: http://ip:port/v1/models/nsfw:predict

Response parameters:

{
    "classes": "porn", 
    "probabilities": {
        "drawings": 0.0000170060648, 
        "hentai": 0.00108581863, 
        "neutral": 0.000101140722, 
        "porn": 0.816358209, 
        "sexy": 0.182437778
    }
}

Python script

import sys
import json
import requests
 
from PIL import Image
import numpy as np
 
_IMAGE_SIZE = 64
# TensorFlow Serving endpoint; replace the host and port with your own deployment
SERVER_URL = 'http://192.168.1.123:8501/v1/models/nsfw:predict'
_LABEL_MAP = {0: 'drawings', 1: 'hentai', 2: 'neutral', 3: 'porn', 4: 'sexy'}
 
def standardize(img):
    mean = np.mean(img)
    std = np.std(img)
    img = (img - mean) / std
    return img
 
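# Load an image and preprocess it into the model's input format
def load_image(image_path):
    img = Image.open(image_path)
    # Force 3 channels so grayscale or RGBA files keep a consistent input shape
    # (assumes the model expects RGB input)
    img = img.convert("RGB")
    img = img.resize((_IMAGE_SIZE, _IMAGE_SIZE))
    img.load()
    data = np.asarray(img, dtype="float32")
    data = standardize(data)
    data = data.astype(np.float16, copy=False)
    return data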
 
# Send the image to the serving endpoint and label the returned scores
def nsfw_predict(image_data):
    pay_load = json.dumps({"inputs": [image_data.tolist()]})
    response = requests.post(SERVER_URL, data=pay_load)
    # Fail early on HTTP errors instead of a confusing KeyError below
    response.raise_for_status()
    data = response.json()
    outputs = data['outputs']
    predict_result = {"classes": _LABEL_MAP.get(outputs['classes'][0])}
    predict_result['probabilities'] = {_LABEL_MAP.get(i): p for i, p in enumerate(outputs['probabilities'][0])}
    return predict_result
 
 
if __name__ == '__main__':
    image_data = load_image(sys.argv[1])
    predict = nsfw_predict(image_data)
    print(predict)
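
The script above sends one image per request and reads only index 0 of the response. The sketch below generalizes the decoding step to every entry in `outputs`, which would apply if several images were placed in the `inputs` list of one request. It assumes the response has the same `classes` and `probabilities` layout that the script indexes, and that the exported model accepts a batch. Neither is confirmed by the source, and `decode_outputs` is a hypothetical helper.

```python
# Same label order as _LABEL_MAP in the script above.
_LABEL_MAP = {0: 'drawings', 1: 'hentai', 2: 'neutral', 3: 'porn', 4: 'sexy'}


def decode_outputs(outputs):
    """Turn the `outputs` dict of a predict response into one labeled result per image."""
    results = []
    for class_index, probs in zip(outputs["classes"], outputs["probabilities"]):
        results.append({
            "classes": _LABEL_MAP.get(class_index),
            "probabilities": {_LABEL_MAP.get(i): p for i, p in enumerate(probs)},
        })
    return results


if __name__ == "__main__":
    # Hand-written stand-in for a two-image response, for illustration only.
    fake_outputs = {
        "classes": [2, 3],
        "probabilities": [
            [0.01, 0.0, 0.98, 0.005, 0.005],
            [0.0, 0.05, 0.05, 0.8, 0.1],
        ],
    }
    for result in decode_outputs(fake_outputs):
        print(result["classes"])
```

Batching several images into one request saves HTTP round trips, at the cost of a larger JSON payload per call.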

Install python3 and pip3

Install the script's dependencies

sudo pip3 install Pillow # provides the PIL module; if this fails, run: sudo yum install python3-devel
sudo pip3 install requests
sudo pip3 install numpy

Run

[root@localhost data]# python3 yellow /home/1.jpeg
{'classes': 'neutral', 'probabilities': {'drawings': 0.00102156017, 'hentai': 0.000259996828, 'neutral': 0.998707414, 'porn': 1.07145597e-05, 'sexy': 3.48052083e-07}}

Calling Python from Java

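import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.springframework.beans.factory.annotation.Value;

// Location of the Python script, injected from configuration
@Value("${yellow.path}")
private String yellowPath;

/**
 * NSFW check using the self-hosted model.
 * @param imagePath path of the image, passed to the Python script as its argument
 * @return the last line printed by the script (the prediction), or an empty string on failure
 */
public String check(String imagePath) {
    String[] arguments = new String[] {"python3", yellowPath, imagePath};
    String classes = "";
    try {
        Process process = Runtime.getRuntime().exec(arguments);
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(process.getInputStream(), "GBK"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                classes = line; // keep the last line of output
            }
        }
        int exitCode = process.waitFor();
        System.out.println(exitCode);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return classes;
}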
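
The Java method receives whatever the script prints, and `print(predict)` emits a Python dict literal with single quotes, which standard JSON parsers on the Java side will reject. One option, sketched below, is to print the result with `json.dumps` so the line is valid JSON. The `format_result` helper is a hypothetical addition to the script, not part of the original.

```python
import json


def format_result(predict_result):
    """Serialize the prediction as a single line of JSON for the calling process."""
    return json.dumps(predict_result, sort_keys=True)


if __name__ == "__main__":
    # Example result with the same shape as the script's output above.
    predict = {
        "classes": "neutral",
        "probabilities": {
            "drawings": 0.00102156017,
            "hentai": 0.000259996828,
            "neutral": 0.998707414,
            "porn": 1.07145597e-05,
            "sexy": 3.48052083e-07,
        },
    }
    line = format_result(predict)
    print(line)
    # Round trip: the printed line parses back to the same data.
    assert json.loads(line) == predict
```

In the script this would replace `print(predict)` with `print(format_result(predict))`. The values come from `response.json()`, so they are plain Python floats and serialize without extra conversion.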

 

References:

https://blog.csdn.net/zhulin2012/article/details/106394274/

https://blog.52itstyle.vip/archives/4863/#%E9%89%B4%E9%BB%84%E6%9C%8D%E5%8A%A1

 

posted @ 2022-02-11 15:20