textattack.constraints.semantics package

Semantic Constraints

Semantic constraints determine whether a transformation is valid based on the semantic similarity between the original input and the transformed input.

BERT Score

BERT Score was introduced in the paper "BERTScore: Evaluating Text Generation with BERT", available on arXiv.

BERT Score measures token similarity between two texts using contextual embeddings.

To decide which tokens to compare, it greedily matches each token in one text to its most similar token in the other text.
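The greedy matching described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `cosine` and `greedy_match_scores` are hypothetical, the two-dimensional vectors stand in for real contextual embeddings, and the actual constraint relies on a BERT model rather than this code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(reference, candidate):
    """Return (precision, recall, f1) for two lists of token vectors.

    Hypothetical helper: each token is greedily paired with its most
    similar token in the other text, and the best similarities are averaged.
    """
    # Precision: match every candidate token to its best reference token.
    precision = sum(
        max(cosine(c, r) for r in reference) for c in candidate
    ) / len(candidate)
    # Recall: match every reference token to its best candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate) for r in reference
    ) / len(reference)
    # F1: harmonic mean of the two, guarding against division by zero.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy embeddings (assumed values, not produced by a real model).
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0]]
precision, recall, f1 = greedy_match_scores(reference, candidate)
```

In this toy case the single candidate token has a perfect match in the reference, while one of the two reference tokens has no similar counterpart, so recall is lower than precision.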

class textattack.constraints.semantics.bert_score.BERTScore(min_bert_score, model_name='bert-base-uncased', num_layers=None, score_type='f1', compare_against_original=True)[source]

Bases: Constraint

A constraint on BERT-Score difference.

Parameters:
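  • min_bert_score (float) – Minimum threshold value for BERT-Score.

  • model_name (str) – Name of the model to use for scoring.

  • num_layers (int) – Number of hidden layers in the model.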

  • score_type (str) – One of:
      (1) precision: match words from candidate text to reference text
      (2) recall: match words from reference text to candidate text
      (3) f1: harmonic mean of precision and recall (recommended)

  • compare_against_original (bool) – If True, compare new x_adv against the original x. Otherwise, compare it against the previous x_adv.
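The effect of compare_against_original can be illustrated with a small sketch. The helper `reference_text` and the example sentences are hypothetical and are not part of the library; they only show which text a new candidate would be scored against.

```python
def reference_text(original, previous, compare_against_original=True):
    """Pick the text that a new candidate is scored against (hypothetical helper)."""
    return original if compare_against_original else previous


# A made-up attack that has already applied one word swap.
original = "the movie was great"    # x
previous = "the film was great"     # x_adv from the previous step
candidate = "the film was superb"   # new x_adv being checked

against_original = reference_text(original, previous, compare_against_original=True)
against_previous = reference_text(original, previous, compare_against_original=False)
```

Comparing against the original bounds the total drift of a multi-step attack, whereas comparing against the previous x_adv only bounds each individual step.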

extra_repr_keys()[source]

Set the extra representation of the constraint using these keys.

To print customized extra information, you should reimplement this method in your own constraint. Both single-line and multi-line strings are acceptable.
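As a rough illustration of the idea, the toy class below builds its printed representation from the attribute names returned by extra_repr_keys(). `ToyConstraint` and its `__repr__` are assumptions made for this sketch; the formatting produced by the real Constraint base class may differ.

```python
class ToyConstraint:
    """Hypothetical stand-in for a constraint; not a TextAttack class."""

    def __init__(self, min_score, compare_against_original=True):
        self.min_score = min_score
        self.compare_against_original = compare_against_original

    def extra_repr_keys(self):
        # Names of the attributes that should appear in the printed form.
        return ["min_score", "compare_against_original"]

    def __repr__(self):
        # Assumed formatting: "ClassName(key=value, ...)".
        parts = [f"{key}={getattr(self, key)!r}" for key in self.extra_repr_keys()]
        return f"{type(self).__name__}({', '.join(parts)})"


text = repr(ToyConstraint(0.8))
```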

SCORE_TYPE2IDX = {'f1': 2, 'precision': 0, 'recall': 1}
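One plausible reading of this mapping is that it indexes into a (precision, recall, f1) tuple returned by the scorer. The sketch below assumes that tuple order and a "greater than or equal to" threshold test; `passes_threshold` and the score values are hypothetical.

```python
SCORE_TYPE2IDX = {"precision": 0, "recall": 1, "f1": 2}


def passes_threshold(scores, score_type, min_bert_score):
    """Check one entry of an assumed (precision, recall, f1) tuple against a minimum."""
    if score_type not in SCORE_TYPE2IDX:
        raise ValueError(f"Unknown score_type: {score_type!r}")
    return scores[SCORE_TYPE2IDX[score_type]] >= min_bert_score


# Made-up scores for a single candidate text.
scores = (0.92, 0.78, 0.84)
precision_ok = passes_threshold(scores, "precision", 0.8)
recall_ok = passes_threshold(scores, "recall", 0.8)
f1_ok = passes_threshold(scores, "f1", 0.8)
```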

Word Embedding Distance

class textattack.constraints.semantics.word_embedding_distance.WordEmbeddingDistance(embedding=None, include_unknown_words=True, min_cos_sim=None, max_mse_dist=None, cased=False, compare_against_original=True)[source]

Bases: Constraint

A constraint on word substitutions which places a maximum distance between the embedding of the word being deleted and the word being inserted.

Parameters:
  • embedding (obj) – Wrapper for word embedding.

  • include_unknown_words (bool) – Whether or not the constraint is fulfilled if the embedding of x or x_adv is unknown.

  • min_cos_sim (float, optional) – The minimum cosine similarity between word embeddings.

  • max_mse_dist (float, optional) – The maximum Euclidean distance between word embeddings.

  • cased (bool) – Whether the embedding distinguishes uppercase from lowercase (defaults to False, meaning lowercase only).

  • compare_against_original (bool) – If True, compare new x_adv against the original x. Otherwise, compare it against the previous x_adv.
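How these parameters might interact can be sketched as follows. The function `swap_allowed` is hypothetical, and whether both thresholds may be set at the same time is not stated here, so the sketch simply applies whichever thresholds are given.

```python
def swap_allowed(cos_sim, mse_dist, min_cos_sim=None, max_mse_dist=None,
                 include_unknown_words=True):
    """Decide whether a word swap satisfies the thresholds (hypothetical helper).

    cos_sim and mse_dist are None when either word has no known embedding.
    """
    if cos_sim is None or mse_dist is None:
        # Unknown embedding: the outcome is governed by include_unknown_words.
        return include_unknown_words
    if min_cos_sim is not None and cos_sim < min_cos_sim:
        return False
    if max_mse_dist is not None and mse_dist > max_mse_dist:
        return False
    return True


close_swap = swap_allowed(0.9, 0.1, min_cos_sim=0.8)
far_swap = swap_allowed(0.3, 1.5, min_cos_sim=0.8)
unknown_rejected = swap_allowed(None, None, min_cos_sim=0.8,
                                include_unknown_words=False)
```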

check_compatibility(transformation)[source]

WordEmbeddingDistance requires a word to be both deleted and inserted at the same index so that their embeddings can be compared; it is therefore restricted to word swaps.

extra_repr_keys()[source]

Set the extra representation of the constraint using these keys.

To print customized extra information, you should reimplement this method in your own constraint. Both single-line and multi-line strings are acceptable.

get_cos_sim(a, b)[source]

Returns the cosine similarity of words with IDs a and b.

get_mse_dist(a, b)[source]

Returns the MSE distance of words with IDs a and b.
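The two measures can be sketched over a toy embedding table keyed by word ID. Everything below is an assumption for illustration: the `EMBEDDINGS` table is made up, and the MSE here is taken as the mean of squared differences, which may not match how the library normalizes it.

```python
import math

# Hypothetical embedding table: word ID -> vector.
EMBEDDINGS = {
    0: [1.0, 0.0],
    1: [0.0, 1.0],
    2: [2.0, 0.0],
}


def get_cos_sim(a, b):
    """Cosine similarity of the vectors for word IDs a and b."""
    u, v = EMBEDDINGS[a], EMBEDDINGS[b]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def get_mse_dist(a, b):
    """Mean squared difference between the vectors for word IDs a and b."""
    u, v = EMBEDDINGS[a], EMBEDDINGS[b]
    return sum((x - y) ** 2 for x, y in zip(u, v)) / len(u)


same_direction = get_cos_sim(0, 2)   # parallel vectors
orthogonal = get_cos_sim(0, 1)       # perpendicular vectors
mse_parallel = get_mse_dist(0, 2)
```

Note that words 0 and 2 point in the same direction, so their cosine similarity is maximal even though their MSE distance is nonzero; the two thresholds therefore capture different notions of closeness.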