textattack.goal_functions.classification package

Goal fucntion for Classification

Submodules

Determine for if an attack has been successful in Classification

class textattack.goal_functions.classification.classification_goal_function.ClassificationGoalFunction(model_wrapper, maximizable=False, use_cache=True, query_budget=inf, model_batch_size=32, model_cache_size=1048576, allow_skip=True)[source]

Bases: GoalFunction

A goal function defined on a model that outputs a probability for some number of classes.

extra_repr_keys()[source]: Extra fields to be included in the representation of a class.

Determine if an attack has been successful in Hard Label Classficiation.

class textattack.goal_functions.classification.hardlabel_classification.HardLabelClassification(*args, target_max_score=None, **kwargs)[source]

Bases: ClassificationGoalFunction

An hard label attack on classification models which attempts to maximize the semantic similarity of the label such that the target is outside of the decision boundary.

Parameters:: target_max_score (float) – If set, goal is to reduce model output to below this score. Otherwise, goal is to change the overall predicted class.

Determine if maintaining the same predicted label (input reduction)

class textattack.goal_functions.classification.input_reduction.InputReduction(*args, target_num_words=1, **kwargs)[source]

Bases: ClassificationGoalFunction

Attempts to reduce the input down to as few words as possible while maintaining the same predicted label.

From Feng, Wallace, Grissom, Iyyer, Rodriguez, Boyd-Graber. (2018). Pathologies of Neural Models Make Interpretations Difficult. https://arxiv.org/abs/1804.07781

extra_repr_keys()[source]: Extra fields to be included in the representation of a class.

Determine if an attack has been successful in targeted Classification

class textattack.goal_functions.classification.targeted_classification.TargetedClassification(*args, target_class=0, **kwargs)[source]

Bases: ClassificationGoalFunction

A targeted attack on classification models which attempts to maximize the score of the target label.

Complete when the arget label is the predicted label.

extra_repr_keys()[source]: Extra fields to be included in the representation of a class.

Determine successful in untargeted Classification

class textattack.goal_functions.classification.untargeted_classification.UntargetedClassification(*args, target_max_score=None, **kwargs)[source]

Bases: ClassificationGoalFunction

An untargeted attack on classification models which attempts to minimize the score of the correct label until it is no longer the predicted label.

Parameters:: target_max_score (float) – If set, goal is to reduce model output to below this score. Otherwise, goal is to change the overall predicted class.