Adapting TextAttack to Chinese Language

Please remember to run the following in your notebook environment before running the tutorial code:

pip3 install textattack[tensorflow]

With a few modifications to the standard TextAttack commands, Chinese language models can be attacked just like English ones. Four transformations are available for Chinese attacks and augmentation:

  1. ChineseHomophoneCharacterSwap: transforms an input by replacing its words with substitutions that share similar or identical pronunciation.

  2. ChineseMorphonymCharacterSwap: transforms an input by replacing its words with substitutions that share similar glyph structures.

  3. ChineseWordSwapHowNet: transforms an input by replacing its words with synonyms provided by OpenHowNet.

  4. ChineseWordSwapMaskedLM: transforms an input with potential replacements using a masked language model.
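
To make the idea of a character-level swap concrete, here is a minimal sketch in plain Python. It does not use TextAttack: the lookup table `HOMOPHONES` and the function `swap_one_character` are hypothetical names invented for this illustration, and the table holds only a few well-known homophones (他/她/它 are all pronounced "tā", 在/再 are both "zài"). The real transformations draw on much larger resources.

```python
# Hypothetical, hand-made homophone table (illustration only, not TextAttack data).
HOMOPHONES = {
    "他": ["她", "它"],  # all pronounced "ta"
    "在": ["再"],        # both pronounced "zai"
}


def swap_one_character(text, table):
    """Return every variant of `text` that differs in exactly one character."""
    variants = []
    for i, ch in enumerate(text):
        for candidate in table.get(ch, []):
            variants.append(text[:i] + candidate + text[i + 1:])
    return variants


variants = swap_one_character("他在家", HOMOPHONES)
print(variants)
```

A glyph-based swap would follow the same pattern with a table of visually similar characters instead of similarly pronounced ones.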

We begin with imports:

[ ]:
# Import required packages
import transformers
import string
import os
import pandas as pd
import datasets

# Import classes required to build an Attacker
from textattack.models.wrappers import HuggingFaceModelWrapper
from textattack.search_methods import GreedyWordSwapWIR
from textattack.constraints.pre_transformation import (
    RepeatModification,
    StopwordModification,
)
from textattack.goal_functions import UntargetedClassification

from textattack import Attack, Attacker, AttackArgs
from textattack.loggers import CSVLogger
from textattack.datasets import Dataset, HuggingFaceDataset

# Import optional MUSE for higher quality examples
from textattack.constraints.semantics.sentence_encoders import (
    MultilingualUniversalSentenceEncoder,
)

muse = MultilingualUniversalSentenceEncoder(
    threshold=0.9,
    metric="cosine",
    compare_against_original=True,
    window_size=15,
    skip_text_shorter_than_window=True,
)

# Import the transformations

from textattack.transformations import CompositeTransformation
from textattack.transformations import ChineseWordSwapMaskedLM
from textattack.transformations import ChineseMorphonymCharacterSwap
from textattack.transformations import ChineseWordSwapHowNet
from textattack.transformations import ChineseHomophoneCharacterSwap

Next, set up the model and dataset:

[ ]:
# In this example, we attack a pre-trained news topic classification model from Hugging Face (https://huggingface.co/uer/roberta-base-finetuned-chinanews-chinese)
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "uer/roberta-base-finetuned-chinanews-chinese"
)
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "uer/roberta-base-finetuned-chinanews-chinese"
)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Set goal function
goal_function = UntargetedClassification(model_wrapper, query_budget=10000)

# Set the dataset from which we will generate adversarial examples
path = os.path.abspath("")
path_list = path.split(os.sep)
temppath = os.path.normpath("examples/dataset/zh_sentiment/entailment_dataset.tsv")
dataset = datasets.load_dataset("csv", data_files=temppath, delimiter="\t")["train"]
dataset = HuggingFaceDataset(
    dataset,
    dataset_columns=(["text"], "label"),
    label_names=[
        "Mainland China politics",
        "Hong Kong - Macau politics",
        "International news",
        "Financial news",
        "Culture",
        "Entertainment",
        "Sports",
    ],
)
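
The dataset file itself is not reproduced in this tutorial, so the following is only a sketch of the layout that the `load_dataset` call and `dataset_columns=(["text"], "label")` appear to expect. It assumes a header row with `text` and `label` columns, tab-separated values, and integer labels that index into `label_names`. The two rows are illustrative, built from headlines that appear in the attack output further down, and the integer encoding is an assumption.

```python
import csv
import io

LABEL_NAMES = [
    "Mainland China politics",
    "Hong Kong - Macau politics",
    "International news",
    "Financial news",
    "Culture",
    "Entertainment",
    "Sports",
]

# Assumed layout: header row, then one tab-separated headline and integer label per line.
sample_tsv = (
    "text\tlabel\n"
    "国际田联世界挑战赛 罗伯斯迎来赛季第三冠\t6\n"
    "成都现“真人图书馆”:无书“借人”给你读\t4\n"
)

rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))

# Map each integer label back to its human-readable name.
decoded = [(row["text"], LABEL_NAMES[int(row["label"])]) for row in rows]
for text, label in decoded:
    print(label, "|", text)
```

If your own file uses different column names, adjust `dataset_columns` accordingly.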

If this is your first time using OpenHowNet, run the following code block to download its data:

[ ]:
import OpenHowNet

OpenHowNet.download()

Now we are ready to attack! With the goal function, transformation, constraints, and search method in place, we create the Attacker just as for any other TextAttack attack:

[ ]:
# transformation, using ChineseWordSwapMaskedLM transformation in this example

transformation = ChineseWordSwapMaskedLM()

# constraint
stopwords = set(
    [
        "、",
        "。",
        "〈",
        "〉",
        "《",
        "》",
        "一",
        "一个",
        "一些",
        "一何",
        "一切",
        "一则",
        "一方面",
        "一旦",
        "一来",
        "一样",
        "一种",
        "一般",
        "一转眼",
        "七",
        "万一",
        "三",
        "上",
        "上下",
        "下",
        "不",
        "不仅",
        "不但",
        "不光",
        "不单",
        "不只",
        "不外乎",
        "不如",
        "不妨",
        "不尽",
        "不尽然",
        "不得",
        "不怕",
        "不惟",
        "不成",
        "不拘",
        "不料",
        "不是",
        "不比",
        "不然",
        "不特",
        "不独",
        "不管",
        "不至于",
        "不若",
        "不论",
        "不过",
        "不问",
        "与",
        "与其",
        "与其说",
        "与否",
        "与此同时",
        "且",
        "且不说",
        "且说",
        "两者",
        "个",
        "个别",
        "中",
        "临",
        "为",
        "为了",
        "为什么",
        "为何",
        "为止",
        "为此",
        "为着",
        "乃",
        "乃至",
        "乃至于",
        "么",
        "之",
        "之一",
        "之所以",
        "之类",
        "乌乎",
        "乎",
        "乘",
        "九",
        "也",
        "也好",
        "也罢",
        "了",
        "二",
        "二来",
        "于",
        "于是",
        "于是乎",
        "云云",
        "云尔",
        "五",
        "些",
        "亦",
        "人",
        "人们",
        "人家",
        "什",
        "什么",
        "什么样",
        "今",
        "介于",
        "仍",
        "仍旧",
        "从",
        "从此",
        "从而",
        "他",
        "他人",
        "他们",
        "他们们",
        "以",
        "以上",
        "以为",
        "以便",
        "以免",
        "以及",
        "以故",
        "以期",
        "以来",
        "以至",
        "以至于",
        "以致",
        "们",
        "任",
        "任何",
        "任凭",
        "会",
        "似的",
        "但",
        "但凡",
        "但是",
        "何",
        "何以",
        "何况",
        "何处",
        "何时",
        "余外",
        "作为",
        "你",
        "你们",
        "使",
        "使得",
        "例如",
        "依",
        "依据",
        "依照",
        "便于",
        "俺",
        "俺们",
        "倘",
        "倘使",
        "倘或",
        "倘然",
        "倘若",
        "借",
        "借傥然",
        "假使",
        "假如",
        "假若",
        "做",
        "像",
        "儿",
        "先不先",
        "光",
        "光是",
        "全体",
        "全部",
        "八",
        "六",
        "兮",
        "共",
        "关于",
        "关于具体地说",
        "其",
        "其一",
        "其中",
        "其二",
        "其他",
        "其余",
        "其它",
        "其次",
        "具体地说",
        "具体说来",
        "兼之",
        "内",
        "再",
        "再其次",
        "再则",
        "再有",
        "再者",
        "再者说",
        "再说",
        "冒",
        "冲",
        "况且",
        "几",
        "几时",
        "凡",
        "凡是",
        "凭",
        "凭借",
        "出于",
        "出来",
        "分",
        "分别",
        "则",
        "则甚",
        "别",
        "别人",
        "别处",
        "别是",
        "别的",
        "别管",
        "别说",
        "到",
        "前后",
        "前此",
        "前者",
        "加之",
        "加以",
        "区",
        "即",
        "即令",
        "即使",
        "即便",
        "即如",
        "即或",
        "即若",
        "却",
        "去",
        "又",
        "又及",
        "及",
        "及其",
        "及至",
        "反之",
        "反而",
        "反过来",
        "反过来说",
        "受到",
        "另",
        "另一方面",
        "另外",
        "另悉",
        "只",
        "只当",
        "只怕",
        "只是",
        "只有",
        "只消",
        "只要",
        "只限",
        "叫",
        "叮咚",
        "可",
        "可以",
        "可是",
        "可见",
        "各",
        "各个",
        "各位",
        "各种",
        "各自",
        "同",
        "同时",
        "后",
        "后者",
        "向",
        "向使",
        "向着",
        "吓",
        "吗",
        "否则",
        "吧",
        "吧哒",
        "含",
        "吱",
        "呀",
        "呃",
        "呕",
        "呗",
        "呜",
        "呜呼",
        "呢",
        "呵",
        "呵呵",
        "呸",
        "呼哧",
        "咋",
        "和",
        "咚",
        "咦",
        "咧",
        "咱",
        "咱们",
        "咳",
        "哇",
        "哈",
        "哈哈",
        "哉",
        "哎",
        "哎呀",
        "哎哟",
        "哗",
        "哟",
        "哦",
        "哩",
        "哪",
        "哪个",
        "哪些",
        "哪儿",
        "哪天",
        "哪年",
        "哪怕",
        "哪样",
        "哪边",
        "哪里",
        "哼",
        "哼唷",
        "唉",
        "唯有",
        "啊",
        "啐",
        "啥",
        "啦",
        "啪达",
        "啷当",
        "喂",
        "喏",
        "喔唷",
        "喽",
        "嗡",
        "嗡嗡",
        "嗬",
        "嗯",
        "嗳",
        "嘎",
        "嘎登",
        "嘘",
        "嘛",
        "嘻",
        "嘿",
        "嘿嘿",
        "四",
        "因",
        "因为",
        "因了",
        "因此",
        "因着",
        "因而",
        "固然",
        "在",
        "在下",
        "在于",
        "地",
        "基于",
        "处在",
        "多",
        "多么",
        "多少",
        "大",
        "大家",
        "她",
        "她们",
        "好",
        "如",
        "如上",
        "如上所述",
        "如下",
        "如何",
        "如其",
        "如同",
        "如是",
        "如果",
        "如此",
        "如若",
        "始而",
        "孰料",
        "孰知",
        "宁",
        "宁可",
        "宁愿",
        "宁肯",
        "它",
        "它们",
        "对",
        "对于",
        "对待",
        "对方",
        "对比",
        "将",
        "小",
        "尔",
        "尔后",
        "尔尔",
        "尚且",
        "就",
        "就是",
        "就是了",
        "就是说",
        "就算",
        "就要",
        "尽",
        "尽管",
        "尽管如此",
        "岂但",
        "己",
        "已",
        "已矣",
        "巴",
        "巴巴",
        "年",
        "并",
        "并且",
        "庶乎",
        "庶几",
        "开外",
        "开始",
        "归",
        "归齐",
        "当",
        "当地",
        "当然",
        "当着",
        "彼",
        "彼时",
        "彼此",
        "往",
        "待",
        "很",
        "得",
        "得了",
        "怎",
        "怎么",
        "怎么办",
        "怎么样",
        "怎奈",
        "怎样",
        "总之",
        "总的来看",
        "总的来说",
        "总的说来",
        "总而言之",
        "恰恰相反",
        "您",
        "惟其",
        "慢说",
        "我",
        "我们",
        "或",
        "或则",
        "或是",
        "或曰",
        "或者",
        "截至",
        "所",
        "所以",
        "所在",
        "所幸",
        "所有",
        "才",
        "才能",
        "打",
        "打从",
        "把",
        "抑或",
        "拿",
        "按",
        "按照",
        "换句话说",
        "换言之",
        "据",
        "据此",
        "接着",
        "故",
        "故此",
        "故而",
        "旁人",
        "无",
        "无宁",
        "无论",
        "既",
        "既往",
        "既是",
        "既然",
        "日",
        "时",
        "时候",
        "是",
        "是以",
        "是的",
        "更",
        "曾",
        "替",
        "替代",
        "最",
        "月",
        "有",
        "有些",
        "有关",
        "有及",
        "有时",
        "有的",
        "望",
        "朝",
        "朝着",
        "本",
        "本人",
        "本地",
        "本着",
        "本身",
        "来",
        "来着",
        "来自",
        "来说",
        "极了",
        "果然",
        "果真",
        "某",
        "某个",
        "某些",
        "某某",
        "根据",
        "欤",
        "正值",
        "正如",
        "正巧",
        "正是",
        "此",
        "此地",
        "此处",
        "此外",
        "此时",
        "此次",
        "此间",
        "毋宁",
        "每",
        "每当",
        "比",
        "比及",
        "比如",
        "比方",
        "没奈何",
        "沿",
        "沿着",
        "漫说",
        "点",
        "焉",
        "然则",
        "然后",
        "然而",
        "照",
        "照着",
        "犹且",
        "犹自",
        "甚且",
        "甚么",
        "甚或",
        "甚而",
        "甚至",
        "甚至于",
        "用",
        "用来",
        "由",
        "由于",
        "由是",
        "由此",
        "由此可见",
        "的",
        "的确",
        "的话",
        "直到",
        "相对而言",
        "省得",
        "看",
        "眨眼",
        "着",
        "着呢",
        "矣",
        "矣乎",
        "矣哉",
        "离",
        "秒",
        "称",
        "竟而",
        "第",
        "等",
        "等到",
        "等等",
        "简言之",
        "管",
        "类如",
        "紧接着",
        "纵",
        "纵令",
        "纵使",
        "纵然",
        "经",
        "经过",
        "结果",
        "给",
        "继之",
        "继后",
        "继而",
        "综上所述",
        "罢了",
        "者",
        "而",
        "而且",
        "而况",
        "而后",
        "而外",
        "而已",
        "而是",
        "而言",
        "能",
        "能否",
        "腾",
        "自",
        "自个儿",
        "自从",
        "自各儿",
        "自后",
        "自家",
        "自己",
        "自打",
        "自身",
        "至",
        "至于",
        "至今",
        "至若",
        "致",
        "般的",
        "若",
        "若夫",
        "若是",
        "若果",
        "若非",
        "莫不然",
        "莫如",
        "莫若",
        "虽",
        "虽则",
        "虽然",
        "虽说",
        "被",
        "要",
        "要不",
        "要不是",
        "要不然",
        "要么",
        "要是",
        "譬喻",
        "譬如",
        "让",
        "许多",
        "论",
        "设使",
        "设或",
        "设若",
        "诚如",
        "诚然",
        "该",
        "说",
        "说来",
        "请",
        "诸",
        "诸位",
        "诸如",
        "谁",
        "谁人",
        "谁料",
        "谁知",
        "贼死",
        "赖以",
        "赶",
        "起",
        "起见",
        "趁",
        "趁着",
        "越是",
        "距",
        "跟",
        "较",
        "较之",
        "边",
        "过",
        "还",
        "还是",
        "还有",
        "还要",
        "这",
        "这一来",
        "这个",
        "这么",
        "这么些",
        "这么样",
        "这么点儿",
        "这些",
        "这会儿",
        "这儿",
        "这就是说",
        "这时",
        "这样",
        "这次",
        "这般",
        "这边",
        "这里",
        "进而",
        "连",
        "连同",
        "逐步",
        "通过",
        "遵循",
        "遵照",
        "那",
        "那个",
        "那么",
        "那么些",
        "那么样",
        "那些",
        "那会儿",
        "那儿",
        "那时",
        "那样",
        "那般",
        "那边",
        "那里",
        "都",
        "鄙人",
        "鉴于",
        "针对",
        "阿",
        "除",
        "除了",
        "除外",
        "除开",
        "除此之外",
        "除非",
        "随",
        "随后",
        "随时",
        "随着",
        "难道说",
        "零",
        "非",
        "非但",
        "非徒",
        "非特",
        "非独",
        "靠",
        "顺",
        "顺着",
        "首先",
        "︿",
        "!",
        "#",
        "$",
        "%",
        "&",
        "(",
        ")",
        "*",
        "+",
        ",",
        "0",
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "7",
        "8",
        "9",
        ":",
        ";",
        "<",
        ">",
        "?",
        "@",
        "[",
        "]",
        "{",
        "|",
        "}",
        "~",
        "¥",
    ]
)
stopwords = stopwords.union(set(string.punctuation))
constraints = [RepeatModification(), StopwordModification(stopwords=stopwords)]

# search method
search_method = GreedyWordSwapWIR(wir_method="weighted-saliency")

# attack!
attack = Attack(goal_function, constraints, transformation, search_method)
attack_args = AttackArgs(num_examples=20)
attacker = Attacker(attack, dataset, attack_args)
attack_results = attacker.attack_dataset()
Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  weighted-saliency
  )
  (goal_function):  UntargetedClassification
  (transformation):  ChineseWordSwapMaskedLM
  (constraints):
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
)


  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [03:10<1:00:26, 190.86s/it]
[Succeeded / Failed / Skipped / Total] 0 / 1 / 0 / 1:   5%|▌         | 1/20 [03:10<1:00:26, 190.86s/it]
--------------------------------------------- Result 1 ---------------------------------------------
[[Sports (100%)]] --> [[[FAILED]]]

林书豪新秀赛上甘心"跑龙套" 自称仍是底薪球员



[Succeeded / Failed / Skipped / Total] 0 / 1 / 0 / 1:  10%|█         | 2/20 [06:55<1:02:18, 207.69s/it]
[Succeeded / Failed / Skipped / Total] 0 / 2 / 0 / 2:  10%|█         | 2/20 [06:55<1:02:18, 207.70s/it]
--------------------------------------------- Result 2 ---------------------------------------------
[[Culture (100%)]] --> [[[FAILED]]]

成都现“真人图书馆”:无书“借人”给你读



[Succeeded / Failed / Skipped / Total] 0 / 2 / 0 / 2:  15%|█▌        | 3/20 [07:01<39:50, 140.61s/it]
[Succeeded / Failed / Skipped / Total] 0 / 2 / 1 / 3:  15%|█▌        | 3/20 [07:01<39:50, 140.61s/it]
--------------------------------------------- Result 3 ---------------------------------------------
[[Mainland china politics (57%)]] --> [[[SKIPPED]]]

中国经济走向更趋稳健务实



[Succeeded / Failed / Skipped / Total] 0 / 2 / 1 / 3:  20%|██        | 4/20 [11:33<46:12, 173.28s/it]
[Succeeded / Failed / Skipped / Total] 0 / 3 / 1 / 4:  20%|██        | 4/20 [11:33<46:12, 173.28s/it]
--------------------------------------------- Result 4 ---------------------------------------------
[[Sports (100%)]] --> [[[FAILED]]]

国际田联世界挑战赛 罗伯斯迎来赛季第三冠



[Succeeded / Failed / Skipped / Total] 0 / 3 / 1 / 4:  25%|██▌       | 5/20 [14:52<44:36, 178.44s/it]
--------------------------------------------- Result 5 ---------------------------------------------

[Succeeded / Failed / Skipped / Total] 1 / 3 / 1 / 5:  25%|██▌       | 5/20 [14:53<44:39, 178.62s/it]
[[International news (66%)]] --> [[Entertainment (68%)]]

德国一电视台合成“默克尔头巾照”惹争议

德国一电视台合成“性感头巾照”惹争议



[Succeeded / Failed / Skipped / Total] 1 / 3 / 1 / 5:  30%|███       | 6/20 [14:57<34:55, 149.65s/it]
[Succeeded / Failed / Skipped / Total] 1 / 3 / 2 / 6:  30%|███       | 6/20 [14:57<34:55, 149.65s/it]
--------------------------------------------- Result 6 ---------------------------------------------
[[Mainland china politics (80%)]] --> [[[SKIPPED]]]

朴槿惠今访华 韩媒称访西安可能为增进与习近平友谊



[Succeeded / Failed / Skipped / Total] 1 / 3 / 2 / 6:  35%|███▌      | 7/20 [15:04<27:59, 129.16s/it]
[Succeeded / Failed / Skipped / Total] 1 / 3 / 3 / 7:  35%|███▌      | 7/20 [15:04<27:59, 129.16s/it]
--------------------------------------------- Result 7 ---------------------------------------------
[[Mainland china politics (59%)]] --> [[[SKIPPED]]]

中国驻休斯敦总领馆举办春节招待会向华裔拜年



[Succeeded / Failed / Skipped / Total] 1 / 3 / 3 / 7:  40%|████      | 8/20 [15:08<22:43, 113.60s/it]
[Succeeded / Failed / Skipped / Total] 1 / 3 / 4 / 8:  40%|████      | 8/20 [15:08<22:43, 113.61s/it]
--------------------------------------------- Result 8 ---------------------------------------------
[[Culture (93%)]] --> [[[SKIPPED]]]

NASA发现“地球兄弟” 具备生命存活条件



[Succeeded / Failed / Skipped / Total] 1 / 3 / 4 / 8:  45%|████▌     | 9/20 [15:13<18:36, 101.52s/it]
[Succeeded / Failed / Skipped / Total] 1 / 3 / 5 / 9:  45%|████▌     | 9/20 [15:13<18:36, 101.52s/it]
--------------------------------------------- Result 9 ---------------------------------------------
[[Culture (53%)]] --> [[[SKIPPED]]]

儿子去世后社交网站账号停用 父亲请求保留记忆



[Succeeded / Failed / Skipped / Total] 1 / 3 / 5 / 9:  50%|█████     | 10/20 [18:20<18:20, 110.06s/it]
[Succeeded / Failed / Skipped / Total] 2 / 3 / 5 / 10:  50%|█████     | 10/20 [18:20<18:20, 110.06s/it]
--------------------------------------------- Result 10 ---------------------------------------------
[[Culture (100%)]] --> [[Entertainment (72%)]]

第六届鲁迅文学奖颁发 格非等35位获奖者领奖

第六届决赛颁发 格非等35位获奖者领奖



[Succeeded / Failed / Skipped / Total] 2 / 3 / 5 / 10:  55%|█████▌    | 11/20 [22:44<18:36, 124.02s/it]
[Succeeded / Failed / Skipped / Total] 3 / 3 / 5 / 11:  55%|█████▌    | 11/20 [22:44<18:36, 124.02s/it]
--------------------------------------------- Result 11 ---------------------------------------------
[[Hong kong - macau politics (96%)]] --> [[Culture (79%)]]

东莞台商欲借“台博会”搭建内销平台

东莞讯欲借“艺博会”搭建内销平台



[Succeeded / Failed / Skipped / Total] 3 / 3 / 5 / 11:  60%|██████    | 12/20 [22:48<15:12, 114.07s/it]
[Succeeded / Failed / Skipped / Total] 3 / 3 / 6 / 12:  60%|██████    | 12/20 [22:48<15:12, 114.07s/it]
--------------------------------------------- Result 12 ---------------------------------------------
[[Financial news (56%)]] --> [[[SKIPPED]]]

日本网友买扇贝当下酒菜 发现内有真正珍珠(图)



[Succeeded / Failed / Skipped / Total] 3 / 3 / 6 / 12:  65%|██████▌   | 13/20 [28:59<15:36, 133.78s/it]
[Succeeded / Failed / Skipped / Total] 3 / 4 / 6 / 13:  65%|██████▌   | 13/20 [28:59<15:36, 133.78s/it]
--------------------------------------------- Result 13 ---------------------------------------------
[[Sports (100%)]] --> [[[FAILED]]]

篮球热潮席卷张江 NBA中投王与拉拉队鼎力加盟



[Succeeded / Failed / Skipped / Total] 3 / 4 / 6 / 13:  70%|███████   | 14/20 [33:40<14:26, 144.34s/it]
[Succeeded / Failed / Skipped / Total] 3 / 5 / 6 / 14:  70%|███████   | 14/20 [33:40<14:26, 144.34s/it]
--------------------------------------------- Result 14 ---------------------------------------------
[[Sports (100%)]] --> [[[FAILED]]]

UFC终极格斗冠军赛开打 "草原狼"遭遇三连败



[Succeeded / Failed / Skipped / Total] 3 / 5 / 6 / 14:  75%|███████▌  | 15/20 [33:45<11:15, 135.04s/it]
[Succeeded / Failed / Skipped / Total] 3 / 5 / 7 / 15:  75%|███████▌  | 15/20 [33:45<11:15, 135.04s/it]
--------------------------------------------- Result 15 ---------------------------------------------
[[Culture (92%)]] --> [[[SKIPPED]]]

水果style:心形水果惹人爱 骰子西瓜乐趣多(图)



[Succeeded / Failed / Skipped / Total] 3 / 5 / 7 / 15:  80%|████████  | 16/20 [40:09<10:02, 150.60s/it]
[Succeeded / Failed / Skipped / Total] 3 / 6 / 7 / 16:  80%|████████  | 16/20 [40:09<10:02, 150.60s/it]
--------------------------------------------- Result 16 ---------------------------------------------
[[Sports (100%)]] --> [[[FAILED]]]

同里杯中国天元赛前瞻:芈昱廷李钦诚争挑战权



[Succeeded / Failed / Skipped / Total] 3 / 6 / 7 / 16:  85%|████████▌ | 17/20 [43:32<07:41, 153.67s/it]
[Succeeded / Failed / Skipped / Total] 4 / 6 / 7 / 17:  85%|████████▌ | 17/20 [43:32<07:41, 153.67s/it]
--------------------------------------------- Result 17 ---------------------------------------------
[[Entertainment (100%)]] --> [[Financial news (99%)]]

桂纶镁为戏体验生活 东北洗衣店当店员

桂纶品牌为首体验生活 东北洗衣店当家



[Succeeded / Failed / Skipped / Total] 4 / 6 / 7 / 17:  90%|█████████ | 18/20 [44:01<04:53, 146.75s/it]
[Succeeded / Failed / Skipped / Total] 4 / 7 / 7 / 18:  90%|█████████ | 18/20 [44:01<04:53, 146.75s/it]
--------------------------------------------- Result 18 ---------------------------------------------
[[Culture (95%)]] --> [[[FAILED]]]

河南羲皇故都朝祖会流传6000年 一天游客80万人



[Succeeded / Failed / Skipped / Total] 4 / 7 / 7 / 18:  95%|█████████▌| 19/20 [44:07<02:19, 139.35s/it]
[Succeeded / Failed / Skipped / Total] 4 / 7 / 8 / 19:  95%|█████████▌| 19/20 [44:07<02:19, 139.35s/it]
--------------------------------------------- Result 19 ---------------------------------------------
[[Culture (92%)]] --> [[[SKIPPED]]]

辛柏青谈追求妻子:用1袋洗衣粉、2块肥皂打动她的



[Succeeded / Failed / Skipped / Total] 4 / 7 / 8 / 19: 100%|██████████| 20/20 [49:19<00:00, 147.96s/it]
[Succeeded / Failed / Skipped / Total] 5 / 7 / 8 / 20: 100%|██████████| 20/20 [49:19<00:00, 147.96s/it]
--------------------------------------------- Result 20 ---------------------------------------------
[[International news (100%)]] --> [[Mainland china politics (66%)]]

朝鲜谴责韩国前方部队打出反朝口号

中国谴责日本前方部队打出侵略口号



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 5      |
| Number of failed attacks:     | 7      |
| Number of skipped attacks:    | 8      |
| Original accuracy:            | 60.0%  |
| Accuracy under attack:        | 35.0%  |
| Attack success rate:          | 41.67% |
| Average perturbed word %:     | 36.39% |
| Average num. words per input: | 9.3    |
| Avg num queries:              | 45.5   |
+-------------------------------+--------+
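
The percentages in the summary table follow from the three counters. The sketch below recomputes them under the usual reading of these counters: a skipped example is one the model already misclassified before any attack, and a failed attack leaves a correct prediction intact. The variable names are chosen for this illustration.

```python
# Counters copied from the summary table above.
successful, failed, skipped = 5, 7, 8
total = successful + failed + skipped

# Examples that were attacked at all are the ones the model classified correctly.
original_accuracy = (successful + failed) / total

# Only the examples whose attack failed are still classified correctly.
accuracy_under_attack = failed / total

# Success rate is measured over attacked examples, not over all examples.
attack_success_rate = successful / (successful + failed)

print(f"Original accuracy:     {original_accuracy:.1%}")
print(f"Accuracy under attack: {accuracy_under_attack:.1%}")
print(f"Attack success rate:   {attack_success_rate:.2%}")
```

Note that the success rate (41.67%) is higher than the drop in accuracy (25 points) because the eight skipped examples are excluded from its denominator.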

As mentioned above, the provided transformations can also be used to augment Chinese sentences. A quick example is shown below:

[ ]:
from textattack.constraints.pre_transformation import RepeatModification
from textattack.constraints.pre_transformation import StopwordModification
from textattack.augmentation import Augmenter

# transformation
transformation = ChineseMorphonymCharacterSwap()

# constraints
constraints = [RepeatModification(), StopwordModification()]

# Create augmenter with specified parameters
augmenter = Augmenter(
    transformation=transformation, pct_words_to_swap=0.1, transformations_per_example=2
)
s = "听见树林的呢喃,发现溪流中的知识。"

# Augment!
augmenter.augment(s)
Building prefix dict from the default dictionary ...
DEBUG:jieba:Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
DEBUG:jieba:Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.888 seconds.
DEBUG:jieba:Loading model cost 0.888 seconds.
Prefix dict has been built successfully.
DEBUG:jieba:Prefix dict has been built successfully.
['听见树林的呢喃,发现溪流中的知织。', '听见树林的呢喃,发视溪流中的知识。']
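
Because a morphonym swap changes only a single character, the difference between the original and augmented sentences can be hard to spot by eye. The following sketch, in plain Python and independent of TextAttack, lists the positions that changed. The helper `changed_positions` is a hypothetical name introduced here, and the strings are copied from the cell above.

```python
original = "听见树林的呢喃,发现溪流中的知识。"
augmented = [
    "听见树林的呢喃,发现溪流中的知织。",
    "听见树林的呢喃,发视溪流中的知识。",
]


def changed_positions(before, after):
    """Return (index, old_char, new_char) for each position that differs."""
    return [
        (i, old, new)
        for i, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


diffs = [changed_positions(original, sentence) for sentence in augmented]
for sentence, diff in zip(augmented, diffs):
    print(sentence, diff)
```

Each augmented sentence differs from the original in exactly one character: 识 becomes 织 in the first, and 现 becomes 视 in the second.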