Goal Functions API Reference
A GoalFunction determines both the conditions under which the attack is successful (in terms of the model outputs) and the heuristic score that we want to maximize when searching for the solution.
GoalFunction
- class textattack.goal_functions.GoalFunction(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
Evaluates how well a perturbed attacked_text object is achieving a specified goal.
- Parameters
model_wrapper (ModelWrapper) – The victim model to attack.
maximizable (bool, optional, defaults to False) – Whether the goal function is maximizable, as opposed to a boolean result of success or failure.
query_budget (float, optional, defaults to float("inf")) – The maximum number of model queries allowed.
model_cache_size (int, optional, defaults to 2**20) – The maximum number of items to keep in the model results cache at once.
- get_output(attacked_text)[source]
Returns output for display based on the result of calling the model.
- get_result(attacked_text, **kwargs)[source]
A helper method that queries self.get_results with a single AttackedText object.
- get_results(attacked_text_list, check_skip=False)[source]
For each attacked_text object in attacked_text_list, returns a result consisting of whether or not the goal has been achieved, the output for display purposes, and a score.
Additionally returns whether the search is over due to the query budget.
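The interaction between the query budget and get_results can be illustrated with a small sketch. This is not the textattack implementation: ToyGoalFunction, toy_model, and the plain tuple used as a result are hypothetical stand-ins, and the success rule (predicted label differs from the ground truth) is assumed only for illustration.

```python
import math


class ToyGoalFunction:
    """Hypothetical stand-in for a goal function; not the textattack class."""

    def __init__(self, model, ground_truth, query_budget=math.inf):
        self.model = model  # callable: text -> list of class probabilities
        self.ground_truth = ground_truth
        self.query_budget = query_budget
        self.num_queries = 0

    def get_results(self, texts):
        # Truncate the batch so the query budget is never exceeded.
        if self.query_budget < math.inf:
            queries_left = int(self.query_budget - self.num_queries)
            texts = texts[:queries_left]
        self.num_queries += len(texts)

        results = []
        for text in texts:
            probs = self.model(text)
            predicted = max(range(len(probs)), key=probs.__getitem__)
            succeeded = predicted != self.ground_truth
            score = 1.0 - probs[self.ground_truth]
            # (goal achieved?, output for display, score)
            results.append((succeeded, predicted, score))

        # The search is over once the budget has been used up.
        search_over = self.num_queries >= self.query_budget
        return results, search_over

    def get_result(self, text):
        # Helper that wraps a single input in a list.
        results, search_over = self.get_results([text])
        return (results[0] if results else None), search_over


def toy_model(text):
    # Assumed toy classifier: class 1 gets more likely with each "bad".
    p1 = min(1.0, 0.2 + 0.3 * text.split().count("bad"))
    return [1.0 - p1, p1]


goal = ToyGoalFunction(toy_model, ground_truth=0, query_budget=3)
results, search_over = goal.get_results(["fine", "bad bad", "bad", "never queried"])
```

With a budget of 3, the fourth input is dropped before the model is called, and the second return value signals that the search has to stop.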
ClassificationGoalFunction
TargetedClassification
UntargetedClassification
- class textattack.goal_functions.classification.UntargetedClassification(*args, target_max_score=None, **kwargs)[source]
An untargeted attack on classification models which attempts to minimize the score of the correct label until it is no longer the predicted label.
- Parameters
target_max_score (float) – If set, the goal is to reduce the model's score for the correct label to below this value. Otherwise, the goal is to change the overall predicted class.
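A minimal sketch of the two success conditions described above, assuming the model output is a list of class probabilities. The function names untargeted_goal_complete and untargeted_score are hypothetical, and the score shown (one minus the probability of the correct label) is an assumed heuristic, not necessarily the one the library uses.

```python
def untargeted_goal_complete(probs, ground_truth, target_max_score=None):
    """Return True when the untargeted goal is met for one model output."""
    if target_max_score is not None:
        # Threshold mode: push the correct label's score below the target.
        return probs[ground_truth] < target_max_score
    # Default mode: the predicted class must differ from the correct label.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted != ground_truth


def untargeted_score(probs, ground_truth):
    """Assumed heuristic: higher when the correct label is less likely."""
    return 1.0 - probs[ground_truth]
```

Note that in threshold mode the goal can be met even though the predicted class has not changed, for example with probabilities [0.6, 0.4] and target_max_score=0.7.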
InputReduction
- class textattack.goal_functions.classification.InputReduction(*args, target_num_words=1, **kwargs)[source]
Attempts to reduce the input down to as few words as possible while maintaining the same predicted label.
From Feng, Wallace, Grissom, Iyyer, Rodriguez, Boyd-Graber. (2018). Pathologies of Neural Models Make Interpretations Difficult. https://arxiv.org/abs/1804.07781
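The input reduction goal can be sketched as follows. The helper names are hypothetical, whitespace splitting stands in for whatever tokenization the library applies, and the score (fraction of words removed while the label is preserved) is an assumption made for illustration.

```python
def reduction_goal_complete(original_label, predicted_label, text, target_num_words=1):
    """True when the label is unchanged and the text is short enough."""
    same_label = predicted_label == original_label
    return same_label and len(text.split()) <= target_num_words


def reduction_score(original_label, predicted_label, original_text, text):
    """Assumed heuristic: fraction of words removed, 0 if the label changed."""
    if predicted_label != original_label:
        return 0.0
    initial = len(original_text.split())
    return (initial - len(text.split())) / initial
```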
TextToTextGoalFunction
- class textattack.goal_functions.text.TextToTextGoalFunction(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
A goal function defined on a model that outputs text.
model: The PyTorch or TensorFlow model used for evaluation.
original_output: The original output of the model.
MinimizeBleu
- class textattack.goal_functions.text.MinimizeBleu(*args, target_bleu=0.0, **kwargs)[source]
Attempts to minimize the BLEU score between the current output translation and the reference translation.
BLEU score was defined in (BLEU: a Method for Automatic Evaluation of Machine Translation).
This goal function is defined in (It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations).
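A sketch of how target_bleu could act as a success threshold. It assumes the BLEU score has already been computed elsewhere and is expressed on a 0 to 1 scale; the function names, the eps tolerance, and the score (one minus BLEU) are all assumptions for illustration rather than the library's confirmed behavior.

```python
def bleu_goal_complete(bleu, target_bleu=0.0, eps=1e-10):
    """True once BLEU has dropped to the target (within a small tolerance)."""
    return bleu <= target_bleu + eps


def bleu_minimization_score(bleu):
    """Assumed heuristic: a lower BLEU gives a higher score to maximize."""
    return 1.0 - bleu
```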
NonOverlappingOutput
- class textattack.goal_functions.text.NonOverlappingOutput(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576)[source]
Ensures that no word in the perturbed output is equal to the word at the same position in the original output.
Defined in seq2sick (https://arxiv.org/pdf/1803.01128.pdf), equation (3).
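The position-wise check can be sketched as below. The function names are hypothetical, whitespace splitting is a simplification, positions beyond the shorter output are ignored, and the score (fraction of original positions whose word changed) is an assumed heuristic.

```python
def outputs_overlap_free(original_output, perturbed_output):
    """True when no position holds the same word in both outputs."""
    pairs = zip(original_output.split(), perturbed_output.split())
    return all(a != b for a, b in pairs)


def non_overlap_score(original_output, perturbed_output):
    """Assumed heuristic: share of original positions whose word changed."""
    original_words = original_output.split()
    if not original_words:
        return 0.0
    differing = sum(a != b for a, b in zip(original_words, perturbed_output.split()))
    return differing / len(original_words)
```

For example, "the cat sat" against "the dog ran" still overlaps at the first position, so the goal is not yet met.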