Using TextAttack - 蔚蓝色の天空

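1. Background

TextAttack is a toolkit, developed by researchers at the University of Virginia and MIT, for quickly implementing adversarial attacks on text. A comparable tool from China is OpenAttack, developed at Tsinghua University. Of the two, TextAttack is currently the more actively maintained; it received an update while this post was being written.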

2. Installation

TextAttack is published as a package on PyPI and can be installed directly with the command below. It requires Python 3.6 or later:

pip install textattack

3. Usage

3.1 End-to-end usage

TextAttack can run an attack end to end from the command line with the following command.

textattack attack \
    --model bert-base-uncased \
    --num-examples -1 \
    --transformation word-swap-embedding \
    --constraints use repeat stopword max-words-perturbed^max_num_words=3 embedding^min_cos_sim=0.8 part-of-speech \
    --goal-function untargeted-classification \
    --attack-recipe textfooler \
    --log-to-csv ./bert_convid_textfooler.csv \
    --dataset-from-file subjectivity_data_trans.py

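Parameter	Meaning
model	The model to attack; it can be downloaded from Hugging Face or loaded from a local path, and must be a PyTorch model
num-examples	Number of examples to attack; -1 means all of them
transformation	The transformation applied to the input text; these fall into two broad categories, paraphrasing and synonym substitution
constraints	The constraints that an adversarial example must satisfy
goal-function	The goal of the attack
attack-recipe	The attack method, chosen from the available attack recipes (here textfooler)
log-to-csv	Path of the CSV file to which the attack results are written
dataset-from-file	A Python file that defines a custom dataset. A built-in dataset can be used instead, in which case the dataset name is appended to the model name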
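
When the attack has to be launched from a script, for example to try several constraint settings, the command shown earlier in this section can be assembled programmatically. The following is only a minimal sketch and not part of TextAttack: `format_constraint` and `build_attack_command` are hypothetical helper names, and the only flags used are the ones that appear in the command above. It also shows how the `name^key=value` form attaches a parameter to a constraint; joining several parameters with commas is an assumption.

```python
import shlex
import subprocess


def format_constraint(name, **params):
    """Render a constraint as `name` or `name^key=value` (hypothetical helper)."""
    if not params:
        return name
    # Assumption: several parameters are separated by commas after the "^".
    rendered = ",".join(f"{key}={value}" for key, value in params.items())
    return f"{name}^{rendered}"


def build_attack_command(model, dataset_file, csv_path, constraints, num_examples=-1):
    """Assemble the argument list for `textattack attack` (hypothetical helper)."""
    return [
        "textattack", "attack",
        "--model", model,
        "--num-examples", str(num_examples),
        "--transformation", "word-swap-embedding",
        "--constraints", *constraints,
        "--goal-function", "untargeted-classification",
        "--attack-recipe", "textfooler",
        "--log-to-csv", csv_path,
        "--dataset-from-file", dataset_file,
    ]


constraints = [
    "use", "repeat", "stopword",
    format_constraint("max-words-perturbed", max_num_words=3),
    format_constraint("embedding", min_cos_sim=0.8),
    "part-of-speech",
]
command = build_attack_command(
    model="bert-base-uncased",
    dataset_file="subjectivity_data_trans.py",
    csv_path="./bert_convid_textfooler.csv",
    constraints=constraints,
)
print(shlex.join(command))  # inspect the command before running it
# subprocess.run(command, check=True)  # uncomment to actually launch the attack
```

Passing the arguments as a list keeps the `^` characters away from the shell, so no extra quoting is needed when the command is run through `subprocess.run`.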

3.2 A dataset-from-file example

The following code is all that is needed to load your own dataset.

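import pandas as pd
from textattack.datasets import Dataset

dataset_name = "convid"

# Read the custom dataset; the CSV needs "text" and "label" columns
pf = pd.read_csv(f"./{dataset_name}_test.csv")

# Wrap the (text, label) pairs; --dataset-from-file picks up the variable named `dataset`
dataset = Dataset(list(zip(pf["text"], pf["label"])))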
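
Since every dataset entry is a (text, label) pair, it can help to clean the rows before wrapping them in `Dataset`. The sketch below is a hedged illustration that uses only the standard library: `rows_to_pairs` is a hypothetical helper, the `text` and `label` column names are taken from the code above, and the sample rows are made up. It assumes the labels are integer class indices.

```python
import csv
import io


def rows_to_pairs(rows, text_col="text", label_col="label"):
    """Convert CSV rows (dicts) into (text, int_label) pairs, skipping blank rows."""
    pairs = []
    for row in rows:
        text = (row.get(text_col) or "").strip()
        label = (row.get(label_col) or "").strip()
        if not text or not label:
            continue  # skip rows with a missing text or label
        pairs.append((text, int(label)))
    return pairs


# A tiny in-memory CSV stands in for ./convid_test.csv (made-up rows).
sample = io.StringIO("text,label\nthe film is great,1\n,0\nthe plot is thin,0\n")
pairs = rows_to_pairs(csv.DictReader(sample))
print(pairs)

# In the dataset file, the pairs would then be wrapped as in the code above:
# dataset = Dataset(pairs)
```

Dropping rows with an empty text or label up front avoids handing the attack examples it cannot score.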

posted on 2023-08-07 22:21 蔚蓝色の天空
