Goal Functions API Reference
A GoalFunction determines both the conditions under which an attack is successful (in terms of the model outputs)
and the heuristic score to maximize when searching for a solution.
- class textattack.goal_functions.GoalFunction(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
Evaluates how well a perturbed attacked_text object is achieving a specified goal.
model_wrapper (ModelWrapper) – The victim model to attack.
maximizable (bool, optional, defaults to False) – Whether the goal function is maximizable, as opposed to a boolean result of success or failure.
query_budget (float, optional, defaults to float("inf")) – The maximum number of model queries allowed.
model_cache_size (int, optional, defaults to 2**20) – The maximum number of items to keep in the model results cache at once.
Returns output for display based on the result of calling the model.
- get_result(attacked_text, **kwargs)[source]
A helper method that queries self.get_results with a single attacked_text object.
- get_results(attacked_text_list, check_skip=False)[source]
For each attacked_text object in attacked_text_list, returns a result consisting of whether or not the goal has been achieved, the output for display purposes, and a score.
Additionally returns whether the search is over due to the query budget.
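The return shape described for get_results and get_result can be illustrated with a small stand-alone sketch. It does not import TextAttack: ToyGoalFunction, score_fn, and threshold are hypothetical names introduced here, and trimming the batch to the remaining budget is an assumption about how the query budget is enforced.

```python
import math


class ToyGoalFunction:
    """Hypothetical stand-in for the behaviour described above; not TextAttack code."""

    def __init__(self, score_fn, query_budget=math.inf, threshold=0.5):
        self.score_fn = score_fn          # maps a text to a score in [0, 1]
        self.query_budget = query_budget  # maximum number of model queries allowed
        self.threshold = threshold        # hypothetical success threshold
        self.num_queries = 0

    def get_results(self, texts):
        # Assumption: only as many texts are queried as the remaining budget allows.
        remaining = self.query_budget - self.num_queries
        if remaining < len(texts):
            texts = texts[: int(remaining)]
        self.num_queries += len(texts)

        # One result per text: (goal achieved, output for display, score).
        results = []
        for text in texts:
            score = self.score_fn(text)
            results.append((score >= self.threshold, f"{score:.2f}", score))

        # The second return value reports whether the query budget is exhausted.
        search_over = self.num_queries >= self.query_budget
        return results, search_over

    def get_result(self, text):
        # Helper that queries get_results with a single text.
        results, search_over = self.get_results([text])
        return (results[0] if results else None), search_over
```

With a budget of 3, a call with four texts evaluates only the first three and reports that the search is over; with the default infinite budget the flag stays False.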
- class textattack.goal_functions.classification.ClassificationGoalFunction(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
A goal function defined on a model that outputs a probability for some number of classes.
- class textattack.goal_functions.classification.TargetedClassification(*args, target_class=0, **kwargs)[source]
A targeted attack on classification models which attempts to maximize the score of the target label.
Complete when the target label is the predicted label.
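As a rough illustration of the targeted goal, the sketch below works on a plain list of class probabilities. The function names are hypothetical and the code is independent of TextAttack; it only mirrors the two statements above (the score is the target label's probability, and the goal is complete when the target label is predicted).

```python
def predicted_label(probs):
    # Index of the highest-probability class.
    return max(range(len(probs)), key=probs.__getitem__)


def targeted_score(probs, target_class=0):
    # The quantity the attack tries to maximize: the target label's score.
    return probs[target_class]


def targeted_is_complete(probs, target_class=0):
    # Complete when the target label is the predicted label.
    return predicted_label(probs) == target_class
```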
- class textattack.goal_functions.classification.UntargetedClassification(*args, target_max_score=None, **kwargs)[source]
An untargeted attack on classification models which attempts to minimize the score of the correct label until it is no longer the predicted label.
target_max_score (float) – If set, goal is to reduce model output to below this score. Otherwise, goal is to change the overall predicted class.
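The two completion modes of the untargeted goal can be sketched as follows. This is a stand-alone illustration with hypothetical function names; using 1 minus the correct label's probability as the score is an assumption that is consistent with "minimize the score of the correct label", not a statement about the library's internals.

```python
def untargeted_is_complete(probs, ground_truth, target_max_score=None):
    if target_max_score is not None:
        # Mode 1: push the correct label's score below the given threshold.
        return probs[ground_truth] < target_max_score
    # Mode 2: change the overall predicted class.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted != ground_truth


def untargeted_score(probs, ground_truth):
    # Assumed heuristic: higher when the correct label's probability is lower.
    return 1.0 - probs[ground_truth]
```

For probabilities [0.75, 0.25] with correct label 0, the goal is not complete by default, but it is complete with target_max_score=0.8, because 0.75 is below that threshold.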
- class textattack.goal_functions.classification.InputReduction(*args, target_num_words=1, **kwargs)[source]
Attempts to reduce the input down to as few words as possible while maintaining the same predicted label.
From Feng, Wallace, Grissom, Iyyer, Rodriguez, Boyd-Graber. (2018). Pathologies of Neural Models Make Interpretations Difficult. https://arxiv.org/abs/1804.07781
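A minimal sketch of the input-reduction condition, assuming words are counted by whitespace splitting (the library may tokenize differently) and using a hypothetical function name:

```python
def input_reduction_is_complete(probs, original_label, text, target_num_words=1):
    # The prediction must be unchanged...
    predicted = max(range(len(probs)), key=probs.__getitem__)
    same_label = predicted == original_label
    # ...and the input must be reduced to at most target_num_words words.
    # Assumption: words are counted by splitting on whitespace.
    short_enough = len(text.split()) <= target_num_words
    return same_label and short_enough
```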
- class textattack.goal_functions.text.TextToTextGoalFunction(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
A goal function defined on a model that outputs text.
model – The PyTorch or TensorFlow model used for evaluation.
original_output – The original output of the model.
- class textattack.goal_functions.text.MinimizeBleu(*args, target_bleu=0.0, **kwargs)[source]
Attempts to minimize the BLEU score between the current output translation and the reference translation.
The BLEU score is defined in "BLEU: a Method for Automatic Evaluation of Machine Translation".
This goal function is defined in "It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations".
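The shape of this goal can be sketched without a BLEU implementation at hand. In the example below, unigram_precision is a deliberately crude stand-in (clipped unigram precision only, with no higher-order n-grams and no brevity penalty), not real BLEU, and both function names are hypothetical. Treating the goal as complete once the score is at or below target_bleu, and using 1 minus the score as the heuristic, are assumptions.

```python
from collections import Counter


def unigram_precision(candidate, reference):
    """Crude stand-in for BLEU: clipped unigram precision only (NOT real BLEU)."""
    cand_words = candidate.split()
    if not cand_words:
        return 0.0
    ref_counts = Counter(reference.split())
    cand_counts = Counter(cand_words)
    # Each candidate word is credited at most as often as it occurs in the reference.
    overlap = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    return overlap / len(cand_words)


def minimize_bleu_result(candidate, reference, target_bleu=0.0, bleu_fn=unigram_precision):
    bleu = bleu_fn(candidate, reference)
    # Assumption: complete once the score drops to target_bleu or below.
    is_complete = bleu <= target_bleu
    # Assumption: the heuristic score grows as the overlap shrinks.
    return is_complete, 1.0 - bleu
```

A real BLEU function with the same (candidate, reference) signature can be passed as bleu_fn in place of the stand-in.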
- class textattack.goal_functions.text.NonOverlappingOutput(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
Ensures that no word in the perturbed output equals the word at the same position in the original output.
Defined in seq2sick (https://arxiv.org/pdf/1803.01128.pdf), equation (3).
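The position-wise comparison can be sketched as below. The function names are hypothetical, words are split on whitespace as a simplifying assumption, and positions beyond the shorter output are ignored.

```python
def count_overlapping_positions(original_output, perturbed_output):
    # Compare the two outputs word by word at matching positions.
    pairs = zip(original_output.split(), perturbed_output.split())
    return sum(a == b for a, b in pairs)


def non_overlapping_is_complete(original_output, perturbed_output):
    # The goal is met only when no position holds the same word in both outputs.
    return count_overlapping_positions(original_output, perturbed_output) == 0
```

Note that the check is positional: "the cat" and "cat the" share both words but count as non-overlapping, because no position holds the same word.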