Attack Recipes API

We provide a number of pre-built attack recipes, which correspond to attacks from the literature. To run an attack recipe from the command line, run:

textattack attack --recipe [recipe_name]
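
For example, the following runs TextFooler against one of TextAttack's pre-trained models (the model name and example count here are illustrative):

textattack attack --recipe textfooler --model bert-base-uncased-mr --num-examples 10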

To initialize an attack in a Python script, use:

<recipe name>.build(model_wrapper)

For example, attack = InputReductionFeng2018.build(model_wrapper) creates attack, an object of type Attack with the goal function, transformation, constraints, and search method specified in that paper. This object can then be used just like any other attack, for example by calling attack.attack_dataset.
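
A minimal end-to-end sketch (the HuggingFace model name is illustrative, and the single-example attack call assumes a recent TextAttack version):

import transformers
from textattack.attack_recipes import TextFoolerJin2019
from textattack.models.wrappers import HuggingFaceModelWrapper

# Wrap a victim model and its tokenizer so TextAttack can query it.
model = transformers.AutoModelForSequenceClassification.from_pretrained("textattack/bert-base-uncased-imdb")
tokenizer = transformers.AutoTokenizer.from_pretrained("textattack/bert-base-uncased-imdb")
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Build the recipe and attack a single labeled example.
attack = TextFoolerJin2019.build(model_wrapper)
result = attack.attack("The movie was a wonderful surprise.", 1)  # (text, ground-truth label)
print(result)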

TextAttack supports the following attack recipes (each recipe’s documentation contains a link to the corresponding paper):

Attacks on classification models

  1. A2T (Towards Improving Adversarial Training of NLP Models (Yoo et al., 2021))

  2. Alzantot Genetic Algorithm (Generating Natural Language Adversarial Examples)

  3. Faster Alzantot Genetic Algorithm (Certified Robustness to Adversarial Word Substitutions)

  4. BAE (BAE: BERT-Based Adversarial Examples)

  5. BERT-Attack: (BERT-Attack: Adversarial Attack Against BERT Using BERT)

  6. CheckList: (Beyond Accuracy: Behavioral Testing of NLP models with CheckList)

  7. DeepWordBug (Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers)

  8. HotFlip (HotFlip: White-Box Adversarial Examples for Text Classification)

  9. Improved Genetic Algorithm (Natural Language Adversarial Attacks and Defenses in Word Level)

  10. Input Reduction (Pathologies of Neural Models Make Interpretations Difficult)

  11. Kuleshov (Adversarial Examples for Natural Language Classification Problems)

  12. Particle Swarm Optimization (Word-level Textual Adversarial Attacking as Combinatorial Optimization)

  13. PWWS (Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency)

  14. TextFooler (Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment)

  15. TextBugger (TextBugger: Generating Adversarial Text Against Real-world Applications)

  16. Pruthi (Combating Adversarial Misspellings with Robust Word Recognition, 2019)

  17. CLARE (Contextualized Perturbation for Textual Adversarial Attack, 2020)

A2T (A2T: Attack for Adversarial Training Recipe)

class textattack.attack_recipes.a2t_yoo_2021.A2TYoo2021(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Towards Improving Adversarial Training of NLP Models.

(Yoo et al., 2021)

https://arxiv.org/abs/2109.00544

static build(model_wrapper, mlm=False)[source]

Build attack recipe.

Parameters:
  • model_wrapper (ModelWrapper) – Model wrapper containing both the model and the tokenizer.

  • mlm (bool, optional, defaults to False) – If True, load A2T-MLM attack. Otherwise, load regular A2T attack.

Returns:

A2T attack.

Return type:

Attack
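
For example (model_wrapper constructed as in the introduction above):

from textattack.attack_recipes import A2TYoo2021

attack = A2TYoo2021.build(model_wrapper)                # regular A2T
attack_mlm = A2TYoo2021.build(model_wrapper, mlm=True)  # A2T-MLM variant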

Alzantot Genetic Algorithm

(Generating Natural Language Adversarial Examples)

Warning

This attack uses a very slow language model. Consider using the faster-alzantot recipe instead.

class textattack.attack_recipes.genetic_algorithm_alzantot_2018.GeneticAlgorithmAlzantot2018(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Alzantot, M., Sharma, Y., Elgohary, A., Ho, B., Srivastava, M.B., & Chang, K. (2018).

Generating Natural Language Adversarial Examples.

https://arxiv.org/abs/1804.07998

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Faster Alzantot Genetic Algorithm

(Certified Robustness to Adversarial Word Substitutions)

class textattack.attack_recipes.faster_genetic_algorithm_jia_2019.FasterGeneticAlgorithmJia2019(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Certified Robustness to Adversarial Word Substitutions.

Robin Jia, Aditi Raghunathan, Kerem Göksel, Percy Liang (2019).

https://arxiv.org/pdf/1909.00986.pdf

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

BAE (BAE: BERT-Based Adversarial Examples)

class textattack.attack_recipes.bae_garg_2019.BAEGarg2019(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Siddhant Garg and Goutham Ramakrishnan, 2019.

BAE: BERT-based Adversarial Examples for Text Classification.

https://arxiv.org/pdf/2004.01970

This is “attack mode” 1 from the paper, BAE-R, word replacement.

We present 4 attack modes for BAE based on the R and I operations, where for each token t in S:
  • BAE-R: Replace token t (see Algorithm 1)

  • BAE-I: Insert a token to the left or right of t

  • BAE-R/I: Either replace token t or insert a token to the left or right of t

  • BAE-R+I: First replace token t, then insert a token to the left or right of t

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

BERT-Attack:

(BERT-Attack: Adversarial Attack Against BERT Using BERT)

Warning

This attack is very slow (see https://github.com/QData/TextAttack/issues/586). Consider using a smaller value for max_candidates.

class textattack.attack_recipes.bert_attack_li_2020.BERTAttackLi2020(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Li, L., Ma, R., Guo, Q., Xue, X., & Qiu, X. (2020).

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

https://arxiv.org/abs/2004.09984


static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack
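
Given the warning above, one workaround is to shrink the candidate pool after building the recipe. This is a sketch, assuming the built attack exposes its masked-LM word-swap transformation with a mutable max_candidates attribute:

from textattack.attack_recipes import BERTAttackLi2020

attack = BERTAttackLi2020.build(model_wrapper)
# Fewer masked-LM candidates per word: faster, but potentially weaker attacks.
attack.transformation.max_candidates = 12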

CheckList:

(Beyond Accuracy: Behavioral Testing of NLP models with CheckList)

class textattack.attack_recipes.checklist_ribeiro_2020.CheckList2020(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

An implementation of the attack used in “Beyond Accuracy: Behavioral Testing of NLP models with CheckList”, Ribeiro et al., 2020.

This recipe implements several of the perturbations used in the paper's invariance testing: contraction, extension, and changing names, numbers, and locations.

https://arxiv.org/abs/2005.04118

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

DeepWordBug

(Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers)

class textattack.attack_recipes.deepwordbug_gao_2018.DeepWordBugGao2018(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Gao, Lanchantin, Soffa, Qi.

Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers.

https://arxiv.org/abs/1801.04354

static build(model_wrapper, use_all_transformations=True)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack
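
For example, toggling the transformation set (use_all_transformations=False appears to fall back to character substitution only; treat that detail as an assumption):

from textattack.attack_recipes import DeepWordBugGao2018

# All four character transformations: swap, substitution, deletion, insertion.
attack = DeepWordBugGao2018.build(model_wrapper, use_all_transformations=True)

# A reduced transformation set (assumed: character substitution only).
attack_small = DeepWordBugGao2018.build(model_wrapper, use_all_transformations=False)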

HotFlip

(HotFlip: White-Box Adversarial Examples for Text Classification)

class textattack.attack_recipes.hotflip_ebrahimi_2017.HotFlipEbrahimi2017(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Ebrahimi, J. et al. (2017)

HotFlip: White-Box Adversarial Examples for Text Classification

https://arxiv.org/abs/1712.06751

This is a reproduction of the HotFlip word-level attack (section 5 of the paper).

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Improved Genetic Algorithm

(Natural Language Adversarial Attacks and Defenses in Word Level)

class textattack.attack_recipes.iga_wang_2019.IGAWang2019(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Xiaosen Wang, Hao Jin, Kun He (2019).

Natural Language Adversarial Attack and Defense in Word Level.

http://arxiv.org/abs/1909.06723

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Input Reduction

(Pathologies of Neural Models Make Interpretations Difficult)

class textattack.attack_recipes.input_reduction_feng_2018.InputReductionFeng2018(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Feng, Wallace, Grissom, Iyyer, Rodriguez, Boyd-Graber. (2018).

Pathologies of Neural Models Make Interpretations Difficult.

https://arxiv.org/abs/1804.07781

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Kuleshov2017

(Adversarial Examples for Natural Language Classification Problems)

class textattack.attack_recipes.kuleshov_2017.Kuleshov2017(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Kuleshov, V. et al.

Adversarial Examples for Natural Language Classification Problems.

https://openreview.net/pdf?id=r1QZ3zbAZ

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Particle Swarm Optimization

(Word-level Textual Adversarial Attacking as Combinatorial Optimization)

class textattack.attack_recipes.pso_zang_2020.PSOZang2020(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Zang, Y., Yang, C., Qi, F., Liu, Z., Zhang, M., Liu, Q., & Sun, M. (2019).

Word-level Textual Adversarial Attacking as Combinatorial Optimization.

https://www.aclweb.org/anthology/2020.acl-main.540.pdf

Methodology description quoted from the paper:

“We propose a novel word substitution-based textual attack model, which reforms both the aforementioned two steps. In the first step, we adopt a sememe-based word substitution strategy, which can generate more candidate adversarial examples with better semantic preservation. In the second step, we utilize particle swarm optimization (Eberhart and Kennedy, 1995) as the adversarial example searching algorithm.”

And “Following the settings in Alzantot et al. (2018), we set the max iteration time G to 20.”

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

PWWS

(Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency)

class textattack.attack_recipes.pwws_ren_2019.PWWSRen2019(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

An implementation of Probability Weighted Word Saliency from “Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency”, Ren et al., 2019.

Words are prioritized for a synonym-swap transformation based on a combination of their saliency score and maximum word-swap effectiveness. Note that this implementation does not include the Named Entity adversarial swap from the original paper, because it requires access to the full dataset and ground truth labels in advance.

https://www.aclweb.org/anthology/P19-1103/

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

TextFooler

(Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment)

class textattack.attack_recipes.textfooler_jin_2019.TextFoolerJin2019(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Jin, D., Jin, Z., Zhou, J.T., & Szolovits, P. (2019).

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment.

https://arxiv.org/abs/1907.11932

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

TextBugger

(TextBugger: Generating Adversarial Text Against Real-world Applications)

class textattack.attack_recipes.textbugger_li_2018.TextBuggerLi2018(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Li, J., Ji, S., Du, T., Li, B., and Wang, T. (2018).

TextBugger: Generating Adversarial Text Against Real-world Applications.

https://arxiv.org/abs/1812.05271

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

CLARE Recipe

(Contextualized Perturbation for Textual Adversarial Attack)

class textattack.attack_recipes.clare_li_2020.CLARE2020(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Li, Zhang, Peng, Chen, Brockett, Sun, Dolan.

“Contextualized Perturbation for Textual Adversarial Attack” (Li et al., 2020)

https://arxiv.org/abs/2009.07502

This method uses greedy search with replace, merge, and insert transformations that leverage a pretrained language model. It also uses a Universal Sentence Encoder (USE) similarity constraint.

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Pruthi2019: Combating Adversarial Misspellings with Robust Word Recognition

class textattack.attack_recipes.pruthi_2019.Pruthi2019(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

An implementation of the attack used in “Combating Adversarial Misspellings with Robust Word Recognition”, Pruthi et al., 2019.

This attack focuses on a small number of character-level changes that simulate common typos. It combines:
  • Swapping neighboring characters

  • Deleting characters

  • Inserting characters

  • Swapping characters for adjacent keys on a QWERTY keyboard.

https://arxiv.org/abs/1905.11268

Parameters:
  • model – Model to attack.

  • max_num_word_swaps – Maximum number of modifications to allow.

static build(model_wrapper, max_num_word_swaps=1)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack
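
For example, allowing up to three typo-style perturbations per input:

from textattack.attack_recipes import Pruthi2019

attack = Pruthi2019.build(model_wrapper, max_num_word_swaps=3)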

Attacks on sequence-to-sequence models

  1. MORPHEUS (It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations)

  2. Seq2Sick (Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples)

MORPHEUS2020

(It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations)

class textattack.attack_recipes.morpheus_tan_2020.MorpheusTan2020(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Samson Tan, Shafiq Joty, Min-Yen Kan, Richard Socher.

It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations

https://www.aclweb.org/anthology/2020.acl-main.263/

static build(model_wrapper)[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack

Seq2Sick

(Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples)

class textattack.attack_recipes.seq2sick_cheng_2018_blackbox.Seq2SickCheng2018BlackBox(goal_function: GoalFunction, constraints: List[Constraint | PreTransformationConstraint], transformation: Transformation, search_method: SearchMethod, transformation_cache_size=32768, constraint_cache_size=32768)[source]

Cheng, Minhao, et al.

Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

https://arxiv.org/abs/1803.01128

This is a greedy re-implementation of the seq2sick attack method. It does not use gradient descent.

static build(model_wrapper, goal_function='non_overlapping')[source]

Creates a pre-built Attack that corresponds to an attack from the literature.

Parameters:
  • model_wrapper (ModelWrapper) – ModelWrapper that contains the victim model and tokenizer. This is passed to GoalFunction when constructing the attack.

  • kwargs – Additional keyword arguments.

Returns:

Attack
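
For example (seq2seq_model_wrapper is a hypothetical name for a ModelWrapper around a sequence-to-sequence model; 'non_overlapping' is the only goal_function value documented here):

from textattack.attack_recipes import Seq2SickCheng2018BlackBox

attack = Seq2SickCheng2018BlackBox.build(seq2seq_model_wrapper, goal_function="non_overlapping")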