Augmenter Recipes API

Summary: Transformations and constraints can be used for simple NLP data augmentations.

In addition to the command-line interface, you can augment text dynamically by importing an Augmenter in your own code. All Augmenter objects implement augment and augment_many, which generate augmentations of a single string or of a list of strings, respectively. Here’s an example of how to use the EmbeddingAugmenter in a Python script:

>>> from textattack.augmentation import EmbeddingAugmenter
>>> augmenter = EmbeddingAugmenter()
>>> s = 'What I cannot create, I do not understand.'
>>> augmenter.augment(s)
['What I notable create, I do not understand.', 'What I significant create, I do not understand.', 'What I cannot engender, I do not understand.', 'What I cannot creating, I do not understand.', 'What I cannot creations, I do not understand.', 'What I cannot create, I do not comprehend.', 'What I cannot create, I do not fathom.', 'What I cannot create, I do not understanding.', 'What I cannot create, I do not understands.', 'What I cannot create, I do not understood.', 'What I cannot create, I do not realise.']
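The difference between the two methods is mostly one of shape: augment takes one string and returns a list of strings, while augment_many takes a list of strings and returns one list of augmentations per input. The sketch below illustrates that contract with a toy class. ToyAugmenter is hypothetical and not part of textattack; it only changes letter case so that it runs without any models or downloads.

```python
from typing import List


class ToyAugmenter:
    """Hypothetical stand-in for an Augmenter; not part of textattack."""

    def augment(self, text: str) -> List[str]:
        # One string in, a list of augmented strings out.
        return [text.upper(), text.lower()]

    def augment_many(self, texts: List[str]) -> List[List[str]]:
        # A list of strings in, one list of augmentations per input out.
        return [self.augment(text) for text in texts]


augmenter = ToyAugmenter()
batch = ["Hello World", "Data Augmentation"]
results = augmenter.augment_many(batch)

# results[i] holds the augmentations of batch[i].
for original, augmented in zip(batch, results):
    print(original, "->", augmented)
```

Because the result is aligned with the input list, any labels attached to the original examples can be carried over to the augmented ones by index.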

You can also create your own augmenter from scratch by importing transformations and constraints from textattack.transformations and textattack.constraints. Here’s an example that generates augmentations of a string using WordSwapRandomCharacterDeletion:

>>> from textattack.transformations import WordSwapRandomCharacterDeletion
>>> from textattack.transformations import CompositeTransformation
>>> from textattack.augmentation import Augmenter
>>> transformation = CompositeTransformation([WordSwapRandomCharacterDeletion()])
>>> augmenter = Augmenter(transformation=transformation, transformations_per_example=5)
>>> s = 'What I cannot create, I do not understand.'
>>> augmenter.augment(s)
['What I cannot creae, I do not understand.', 'What I cannot creat, I do not understand.', 'What I cannot create, I do not nderstand.', 'What I cannot create, I do nt understand.', 'Wht I cannot create, I do not understand.']
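To make the output above easier to interpret, here is a simplified, dependency-free sketch of the idea: pick a word, drop one of its characters, and repeat until the requested number of distinct variants has been collected. The function names, the n parameter (playing the role of transformations_per_example), and the seed argument are assumptions made for illustration; this is not how textattack implements the transformation internally.

```python
import random
from typing import List


def delete_random_character(word: str, rng: random.Random) -> str:
    """Drop one character of `word`; single-character words are left as-is."""
    if len(word) <= 1:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]


def augment_by_deletion(text: str, n: int, seed: int = 0) -> List[str]:
    """Return up to `n` distinct variants of `text`, each with one word perturbed."""
    rng = random.Random(seed)
    words = text.split()
    variants = set()
    attempts = 0
    # Bound the number of attempts so the loop always terminates.
    while len(variants) < n and attempts < 100 * n:
        attempts += 1
        idx = rng.randrange(len(words))
        perturbed = list(words)
        perturbed[idx] = delete_random_character(words[idx], rng)
        candidate = " ".join(perturbed)
        if candidate != text:  # skip no-op perturbations
            variants.add(candidate)
    return sorted(variants)


for variant in augment_by_deletion("What I cannot create, I do not understand.", 5):
    print(variant)
```

Each variant differs from the input by exactly one missing character, which mirrors the shape of the results shown above.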


Augmenter Recipes:

Here is a list of recipes for NLP data augmentation:

class textattack.augmentation.recipes.BackTranscriptionAugmenter(**kwargs)

Sentence level augmentation that uses back transcription (TTS+ASR).

class textattack.augmentation.recipes.BackTranslationAugmenter(**kwargs)

Sentence level augmentation that uses MarianMTModel to back-translate.

https://huggingface.co/transformers/model_doc/marian.html

class textattack.augmentation.recipes.CLAREAugmenter(model='distilroberta-base', tokenizer='distilroberta-base', **kwargs)

Li, Zhang, Peng, Chen, Brockett, Sun, Dolan.

“Contextualized Perturbation for Textual Adversarial Attack” (Li et al., 2020)

https://arxiv.org/abs/2009.07502

CLARE builds on a pre-trained masked language model and modifies the inputs in a context-aware manner. It proposes three contextualized perturbations (Replace, Insert, and Merge), which allow it to generate outputs of varied lengths.

class textattack.augmentation.recipes.CharSwapAugmenter(**kwargs)

Augments words by swapping characters out for other characters.

class textattack.augmentation.recipes.CheckListAugmenter(**kwargs)

Augments words by using the transformation methods provided by CheckList INV testing, which combines:

  • Name Replacement

  • Location Replacement

  • Number Alteration

  • Contraction/Extension

“Beyond Accuracy: Behavioral Testing of NLP models with CheckList” (Ribeiro et al., 2020) https://arxiv.org/abs/2005.04118
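As a rough illustration of one of these transformations, the sketch below contracts and expands a few common phrases. Such edits are meant to preserve meaning, which is why CheckList uses them for invariance (INV) tests. The lookup table and the contract and extend function names are assumptions chosen for this example; the table actually used by CheckList is much larger.

```python
import re

# Tiny illustrative table; a real implementation would cover far more phrases.
CONTRACTIONS = {
    "do not": "don't",
    "cannot": "can't",
    "is not": "isn't",
    "I am": "I'm",
}
# Reverse mapping, used to expand contractions back to their long form.
EXTENSIONS = {short: long for long, short in CONTRACTIONS.items()}


def _replace_all(text: str, table: dict) -> str:
    # Try longer keys first so overlapping phrases resolve predictably.
    for src in sorted(table, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(src) + r"\b", table[src], text)
    return text


def contract(text: str) -> str:
    """Replace long forms with their contractions."""
    return _replace_all(text, CONTRACTIONS)


def extend(text: str) -> str:
    """Replace contractions with their long forms."""
    return _replace_all(text, EXTENSIONS)


sentence = "What I cannot create, I do not understand."
print(contract(sentence))
print(extend(contract(sentence)))
```

A model that is robust to this kind of surface variation should give the same prediction for the original and the rewritten sentence.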

class textattack.augmentation.recipes.DeletionAugmenter(**kwargs)

class textattack.augmentation.recipes.EasyDataAugmenter(pct_words_to_swap=0.1, transformations_per_example=4)

An implementation of Easy Data Augmentation, which combines the following in one augmentation method:

  • WordNet synonym replacement
    • Randomly replace words with their synonyms.

  • Word deletion
    • Randomly remove words from the sentence.

  • Word order swaps
    • Randomly swap the position of words in the sentence.

  • Random synonym insertion
    • Insert a random synonym of a random word at a random location.

“EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks” (Wei and Zou, 2019) https://arxiv.org/abs/1901.11196

augment(text)

Returns all possible augmentations of text according to self.transformation.
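The four Easy Data Augmentation operations listed above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the EasyDataAugmenter implementation: the SYNONYMS table is a hypothetical stand-in for WordNet lookups, each operation edits a single word, and the eda helper with its n_per_op and seed parameters is invented for this example.

```python
import random
from typing import Dict, List

# Hypothetical synonym table standing in for WordNet lookups.
SYNONYMS: Dict[str, List[str]] = {
    "create": ["make", "build"],
    "understand": ["grasp", "comprehend"],
}


def synonym_replacement(words: List[str], rng: random.Random) -> List[str]:
    """Replace one word that has a known synonym."""
    idxs = [i for i, w in enumerate(words) if w in SYNONYMS]
    if not idxs:
        return list(words)
    i = rng.choice(idxs)
    out = list(words)
    out[i] = rng.choice(SYNONYMS[words[i]])
    return out


def random_deletion(words: List[str], rng: random.Random) -> List[str]:
    """Remove one word, unless that would leave the sentence empty."""
    if len(words) <= 1:
        return list(words)
    i = rng.randrange(len(words))
    return words[:i] + words[i + 1:]


def random_swap(words: List[str], rng: random.Random) -> List[str]:
    """Swap the words at two distinct positions."""
    if len(words) < 2:
        return list(words)
    i, j = rng.sample(range(len(words)), 2)
    out = list(words)
    out[i], out[j] = out[j], out[i]
    return out


def random_insertion(words: List[str], rng: random.Random) -> List[str]:
    """Insert a synonym of a random word at a random position."""
    candidates = [w for w in words if w in SYNONYMS]
    if not candidates:
        return list(words)
    synonym = rng.choice(SYNONYMS[rng.choice(candidates)])
    pos = rng.randrange(len(words) + 1)
    return words[:pos] + [synonym] + words[pos:]


def eda(text: str, n_per_op: int = 1, seed: int = 0) -> List[str]:
    """Apply each of the four operations `n_per_op` times to `text`."""
    rng = random.Random(seed)
    words = text.split()
    ops = [synonym_replacement, random_deletion, random_swap, random_insertion]
    return [" ".join(op(words, rng)) for op in ops for _ in range(n_per_op)]


for augmented in eda("I cannot create what I do not understand"):
    print(augmented)
```

With n_per_op=1 the sketch returns four sentences, one per operation, which lines up with the default transformations_per_example=4 in the signature above.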

class textattack.augmentation.recipes.EmbeddingAugmenter(**kwargs)

Augments text by replacing words with their neighbors in a word-embedding space.

class textattack.augmentation.recipes.SwapAugmenter(**kwargs)

class textattack.augmentation.recipes.SynonymInsertionAugmenter(**kwargs)

class textattack.augmentation.recipes.WordNetAugmenter(**kwargs)

Augments text by replacing words with synonyms from the WordNet thesaurus.