textattack.constraints.grammaticality package
Grammaticality:
Grammaticality constraints determine if a transformation is valid based on syntactic properties of the perturbation.
- textattack.constraints.grammaticality.language_models package
- non-pre Language Models:
- textattack.constraints.grammaticality.language_models.google_language_model package
- textattack.constraints.grammaticality.language_models.learning_to_write package
- GPT2 Language Models:
GPT2
- Language Models Constraint
LanguageModelConstraint
CoLA for Grammaticality
- class textattack.constraints.grammaticality.cola.COLA(max_diff, model_name='textattack/bert-base-uncased-CoLA', compare_against_original=True)[source]
Bases:
Constraint
Constrains an attack to text that has a similar number of linguistically acceptable sentences as the original text. Linguistic acceptability is determined by a model pre-trained on the CoLA dataset. By default a BERT model is used; see the pre-trained models README for a full list of available models, or provide your own model from the Hugging Face model hub.
- Parameters:
max_diff (float or int) – The maximum difference allowed between the number of valid sentences in the reference text and the number of valid sentences in the attacked text. An int, or any value greater than or equal to 1, is treated as an absolute count; a float less than 1 is treated as a relative (fractional) difference.
model_name (str) – The name of the pre-trained model to use for classification. The model must be available on the Hugging Face model hub.
compare_against_original (bool) – If True, compare against the original text. Otherwise, compare against the most recent text.
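The two readings of max_diff can be illustrated with a small sketch. This is not TextAttack's implementation: the function names are hypothetical, the check is assumed to be one-sided (only a drop in acceptable sentences counts), and the fractional case is assumed to be relative to the reference count.

```python
def allowed_minimum(reference_valid, max_diff):
    """Smallest number of acceptable sentences the attacked text may contain."""
    if isinstance(max_diff, int) or max_diff >= 1:
        # Absolute budget: up to max_diff fewer acceptable sentences.
        return reference_valid - max_diff
    # Fractional budget: assumed here to be relative to the reference count.
    return reference_valid - reference_valid * max_diff


def within_max_diff(reference_valid, attacked_valid, max_diff):
    """True if the attacked text keeps enough acceptable sentences."""
    return attacked_valid >= allowed_minimum(reference_valid, max_diff)


# A reference text with 4 sentences, all judged acceptable.
print(within_max_diff(4, 3, max_diff=1))     # absolute budget of one sentence
print(within_max_diff(4, 2, max_diff=1))     # two lost sentences exceed that budget
print(within_max_diff(4, 3, max_diff=0.25))  # fractional budget: 25% of 4 sentences
```

Note that 1 (an int) and 0.99 (a float) fall on opposite sides of the rule, so the type of the value passed matters as much as its magnitude.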
LanguageTool Grammar Checker
- class textattack.constraints.grammaticality.language_tool.LanguageTool(grammar_error_threshold=0, compare_against_original=True, language='en-US')[source]
Bases:
Constraint
Uses LanguageTool (https://languagetool.org/) to determine if two sentences have the same number of grammatical errors.
- Parameters:
grammar_error_threshold (int) – the number of additional errors permitted in x_adv relative to x
compare_against_original (bool) – If True, compare against the original text. Otherwise, compare against the most recent text.
language – language to use for languagetool (available choices: https://dev.languagetool.org/languages)
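The role of grammar_error_threshold can be sketched without LanguageTool itself. In the minimal example below, count_repeated_words is a toy stand-in for a real grammar checker and passes_error_threshold is a hypothetical helper; neither is part of TextAttack or LanguageTool.

```python
def count_repeated_words(text):
    """Toy stand-in for a grammar checker: counts immediately repeated words."""
    words = text.lower().split()
    return sum(1 for a, b in zip(words, words[1:]) if a == b)


def passes_error_threshold(reference, candidate, grammar_error_threshold=0,
                           count_errors=count_repeated_words):
    """True if the candidate adds at most grammar_error_threshold errors."""
    errors_added = count_errors(candidate) - count_errors(reference)
    return errors_added <= grammar_error_threshold


x = "the cat sat"
x_adv = "the the cat sat"
print(passes_error_threshold(x, x_adv))                             # one new error, none allowed
print(passes_error_threshold(x, x_adv, grammar_error_threshold=1))  # one new error, one allowed
```

Under this reading, a candidate with fewer errors than the reference always passes, because only additional errors count against the threshold.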
Part of Speech Constraint
- class textattack.constraints.grammaticality.part_of_speech.PartOfSpeech(tagger_type='nltk', tagset='universal', allow_verb_noun_swap=True, compare_against_original=True, language_nltk='eng', language_stanza='en')[source]
Bases:
Constraint
Constrains word swaps to only swap words with the same part of speech. Uses the NLTK universal part-of-speech tagger by default. An implementation of https://arxiv.org/abs/1907.11932 adapted from https://github.com/jind11/TextFooler.
POS taggers from Flair (https://github.com/flairNLP/flair) and Stanza (https://github.com/stanfordnlp/stanza) are also available.
- Parameters:
tagger_type (str) – Name of the tagger to use (available choices: “nltk”, “flair”, “stanza”).
tagset (str) – tagset to use for POS tagging (e.g. “universal”)
allow_verb_noun_swap (bool) – If True, allow verbs to be swapped with nouns and vice versa.
compare_against_original (bool) – If True, compare against the original text. Otherwise, compare against the most recent text.
language_nltk – Language to be used for nltk POS-Tagger (available choices: “eng”, “rus”)
language_stanza – Language to be used for stanza POS-Tagger (available choices: https://stanfordnlp.github.io/stanza/available_models.html)
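The effect of allow_verb_noun_swap can be sketched as a rule over a pair of tags. The helper below is a hypothetical illustration that assumes universal tagset labels such as "NOUN" and "VERB"; it is not copied from the library.

```python
def can_replace_pos(pos_a, pos_b, allow_verb_noun_swap=True):
    """True if a word tagged pos_a may be swapped for a word tagged pos_b."""
    if pos_a == pos_b:
        return True
    # Optionally treat nouns and verbs as interchangeable.
    return allow_verb_noun_swap and {pos_a, pos_b} <= {"NOUN", "VERB"}


print(can_replace_pos("NOUN", "NOUN"))                              # same tag
print(can_replace_pos("NOUN", "VERB"))                              # allowed by default
print(can_replace_pos("NOUN", "VERB", allow_verb_noun_swap=False))  # strict matching
print(can_replace_pos("ADJ", "NOUN"))                               # different tags
```

Relaxing the noun/verb boundary can be useful because taggers often disagree on words such as "run" or "walk" that serve as both.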
- check_compatibility(transformation)[source]
Checks if this constraint is compatible with the given transformation. For example, the WordEmbeddingDistance constraint compares the embedding of the word inserted with that of the word deleted. Therefore it can only be applied in the case of word swaps, and not for transformations which involve only one of insertion or deletion.
- Parameters:
transformation – The Transformation to check compatibility with.
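The compatibility rule described above can be sketched with toy classes. Everything in this example is a hypothetical stand-in (ToyTransformation, ToySwap, ToyInsertion, ToyDeletion and the helper function are not TextAttack classes); it only illustrates why a constraint that compares an inserted word with a deleted word needs both to exist.

```python
class ToyTransformation:
    """Hypothetical stand-in; not TextAttack's Transformation class."""
    inserts_word = False
    deletes_word = False


class ToySwap(ToyTransformation):
    # A swap removes one word and puts another in its place.
    inserts_word = True
    deletes_word = True


class ToyInsertion(ToyTransformation):
    inserts_word = True


class ToyDeletion(ToyTransformation):
    deletes_word = True


def swap_only_constraint_is_compatible(transformation):
    """A word-comparison constraint needs both an inserted and a deleted word."""
    return transformation.inserts_word and transformation.deletes_word


for t in (ToySwap(), ToyInsertion(), ToyDeletion()):
    print(type(t).__name__, swap_only_constraint_is_compatible(t))
```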