textattack.transformations.sentence_transformations package

sentence_transformations package

BackTranslation class

class textattack.transformations.sentence_transformations.back_translation.BackTranslation(src_lang='en', target_lang='es', src_model='Helsinki-NLP/opus-mt-ROMANCE-en', target_model='Helsinki-NLP/opus-mt-en-ROMANCE', chained_back_translation=0)[source]

Bases: SentenceTransformation

A sentence-level transformation that translates the input text into a target language and then translates it back into the source language.

src_lang (string): source language.
target_lang (string): target language; for the list of supported languages, see the bottom of this page.
src_model: translation model from Hugging Face that translates from the target language back to the source language (the default, Helsinki-NLP/opus-mt-ROMANCE-en, translates into English).
target_model: translation model from Hugging Face that translates from the source language to the target language (the default, Helsinki-NLP/opus-mt-en-ROMANCE, translates from English).
chained_back_translation: run back translation in a chain for more perturbation (for example, en-es-en-fr-en).
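The chained mode can be pictured as repeated round trips through different pivot languages. The following is a minimal sketch of that control flow only, not TextAttack's implementation: `chained_back_translate` and the `translate_fn` callable are hypothetical names, and the toy translator records each hop instead of translating so that the language chain is visible.

```python
from typing import Callable, List, Sequence

# Hypothetical signature: translate_fn(text, from_lang, to_lang) -> translated text.
TranslateFn = Callable[[str, str, str], str]


def chained_back_translate(
    text: str,
    translate_fn: TranslateFn,
    src_lang: str,
    pivot_langs: Sequence[str],
) -> str:
    """Round-trip `text` through each pivot language in turn."""
    for pivot in pivot_langs:
        # Hop out to the pivot language, then back to the source language.
        pivot_text = translate_fn(text, src_lang, pivot)
        text = translate_fn(pivot_text, pivot, src_lang)
    return text


# Toy stand-in for a real translation model: it records each hop instead of
# translating, so the chain of languages can be inspected afterwards.
hops: List[str] = []


def fake_translate(text: str, from_lang: str, to_lang: str) -> str:
    if not hops:
        hops.append(from_lang)
    hops.append(to_lang)
    return text


result = chained_back_translate(
    "What on earth are you doing here.", fake_translate, "en", ["es", "fr"]
)
print("-".join(hops))  # en-es-en-fr-en
```

With a real model in place of `fake_translate`, each extra pivot language adds another lossy round trip, which is why chaining tends to perturb the sentence more than a single back translation.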

Example:

>>> from textattack.transformations.sentence_transformations import BackTranslation
>>> from textattack.constraints.pre_transformation import RepeatModification, StopwordModification
>>> from textattack.augmentation import Augmenter

>>> transformation = BackTranslation()
>>> constraints = [RepeatModification(), StopwordModification()]
>>> augmenter = Augmenter(transformation=transformation, constraints=constraints)
>>> s = 'What on earth are you doing here.'

>>> augmenter.augment(s)
translate(input, model, tokenizer, lang='es')[source]
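Multilingual Marian checkpoints such as Helsinki-NLP/opus-mt-en-ROMANCE expect the desired output language as a `>>code<<` token at the start of the input text. The sketch below shows only that input-preparation step. The helper name `add_language_token` is hypothetical, and it is an assumption that translate prepares its input in a comparable way for the `lang` argument.

```python
def add_language_token(text: str, lang: str) -> str:
    """Prefix `text` with a Marian-style target-language token such as '>>es<<'."""
    # Accept either a bare code ("es") or an already formatted token (">>es<<").
    if lang.startswith(">>") and lang.endswith("<<"):
        token = lang
    else:
        token = f">>{lang}<<"
    return f"{token} {text}"


print(add_language_token("What on earth are you doing here.", "es"))
# >>es<< What on earth are you doing here.
```

The prefixed string is what would then be passed to the tokenizer before generation.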

BackTranscription class

class textattack.transformations.sentence_transformations.back_transcription.BackTranscription(tts_model='facebook/fastspeech2-en-ljspeech', asr_model='openai/whisper-base')[source]

Bases: SentenceTransformation

A sentence-level transformation that converts the input text into synthesized speech using a text-to-speech (TTS) model and then transcribes the speech back to text using an automatic speech recognition (ASR) model.

tts_model: text-to-speech model from Hugging Face.
asr_model: automatic speech recognition model from Hugging Face.

(!) The Python libraries fairseq, g2p_en, and librosa must be installed to use this transformation.
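One way to check for these libraries before constructing the transformation is to probe for them with the standard library. This is a minimal sketch, assuming the import names match the package names listed above; `missing_modules` is a hypothetical helper, not part of TextAttack.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["fairseq", "g2p_en", "librosa"])
if missing:
    print("Install before using BackTranscription:", ", ".join(missing))
else:
    print("All optional dependencies are available.")
```

The check only locates the modules and does not import them, so it stays fast even when the libraries are installed.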

Example:

>>> from textattack.transformations.sentence_transformations import BackTranscription
>>> from textattack.constraints.pre_transformation import RepeatModification, StopwordModification
>>> from textattack.augmentation import Augmenter

>>> transformation = BackTranscription()
>>> constraints = [RepeatModification(), StopwordModification()]
>>> augmenter = Augmenter(transformation=transformation, constraints=constraints)
>>> s = 'What on earth are you doing here.'

>>> augmenter.augment(s)

You can find more about the back transcription method in the following paper:

@inproceedings{kubis-etal-2023-back,
    title = "Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors",
    author = "Kubis, Marek and Sk{\'o}rzewski, Pawe{\l} and Sowa{\'n}ski, Marcin and Zietkiewicz, Tomasz",
    editor = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.724",
    doi = "10.18653/v1/2023.emnlp-main.724",
    pages = "11824--11835",
}

back_transcribe(text)[source]
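Conceptually, back transcription is a two-step round trip: synthesize speech from the text, then recognize that speech back into text. The sketch below illustrates the idea with toy stand-ins rather than real models. `back_transcribe_sketch`, `toy_tts`, and `toy_asr` are hypothetical names, and nothing here reflects how back_transcribe is actually implemented.

```python
from typing import Callable, List

# Hypothetical stand-in type: a waveform is just a list of float samples.
Waveform = List[float]


def back_transcribe_sketch(
    text: str,
    synthesize: Callable[[str], Waveform],
    transcribe: Callable[[Waveform], str],
) -> str:
    """Convert text to speech, then recognize the speech back into text."""
    audio = synthesize(text)  # text-to-speech (TTS) step
    return transcribe(audio)  # automatic speech recognition (ASR) step


# Toy models: the "TTS" encodes characters as numbers, and the "ASR" decodes
# them but drops punctuation and casing, imitating a lossy recognizer.
def toy_tts(text: str) -> Waveform:
    return [float(ord(ch)) for ch in text]


def toy_asr(audio: Waveform) -> str:
    chars = (chr(int(sample)) for sample in audio)
    return "".join(ch for ch in chars if ch.isalnum() or ch == " ").lower()


print(back_transcribe_sketch("What on earth are you doing here.", toy_tts, toy_asr))
# what on earth are you doing here
```

The output differs from the input only through what the recognizer loses, which is the kind of perturbation the transformation is meant to produce.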

SentenceTransformation class

https://github.com/makcedward/nlpaug

class textattack.transformations.sentence_transformations.sentence_transformation.SentenceTransformation[source]

Bases: Transformation