textattack.models.helpers package

Model Helpers

GloVe Embedding

class textattack.models.helpers.glove_embedding_layer.EmbeddingLayer(n_d=100, embedding_matrix=None, word_list=None, oov='<oov>', pad='<pad>', normalize=True)[source]

Bases: torch.nn.modules.module.Module

A layer of a model that replaces word IDs with their embeddings.

This is a useful abstraction for any nn.Module that takes word IDs (a sequence of text) as its input but operates on the words’ embeddings.

Requires some pre-trained embedding with associated word IDs.
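The ID-to-embedding lookup described above can be sketched in plain Python. This is a minimal illustration, not the layer's actual implementation: the toy vectors, the `encode` and `embed` helpers, and the padding behaviour are assumptions made for the example, while the `<oov>` and `<pad>` tokens mirror the constructor defaults.

```python
from typing import Dict, List

# Toy 3-dimensional vectors (made up); real pre-trained embeddings are far larger.
word_list = ["<oov>", "<pad>", "good", "movie"]
embedding_matrix = [
    [0.0, 0.0, 0.0],  # <oov>
    [0.0, 0.0, 0.0],  # <pad>
    [0.9, 0.1, 0.3],  # good
    [0.2, 0.8, 0.5],  # movie
]

word2id: Dict[str, int] = {word: i for i, word in enumerate(word_list)}
oov_id = word2id["<oov>"]
pad_id = word2id["<pad>"]


def encode(tokens: List[str], max_len: int) -> List[int]:
    """Map tokens to IDs, using the OOV ID for unknown words and padding to max_len."""
    ids = [word2id.get(token, oov_id) for token in tokens[:max_len]]
    return ids + [pad_id] * (max_len - len(ids))


def embed(ids: List[int]) -> List[List[float]]:
    """Replace each word ID with its row of the embedding matrix."""
    return [embedding_matrix[i] for i in ids]


ids = encode(["good", "unseen", "movie"], max_len=4)
vectors = embed(ids)
```

Here the unknown word "unseen" falls back to the `<oov>` row, and the sequence is padded to four positions with the `<pad>` row.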

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
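The difference between calling the instance and calling forward() directly can be illustrated with a small stand-in class. `TinyModule` below is hypothetical and greatly simplified; it is not torch.nn.Module, and only mimics the idea that `__call__` wraps `forward` and runs registered hooks.

```python
class TinyModule:
    """Simplified stand-in for a module with forward hooks (not torch.nn.Module)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        # The actual computation lives here.
        return x * 2

    def __call__(self, x):
        # Calling the instance runs forward() and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


calls = []
module = TinyModule()
module.register_forward_hook(lambda mod, inp, out: calls.append((inp, out)))

direct = module.forward(3)  # hooks are skipped
via_call = module(3)        # hooks run
```

Both calls return the same value, but only the second one is recorded by the hook.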

training: bool

class textattack.models.helpers.glove_embedding_layer.GloveEmbeddingLayer(emb_layer_trainable=True)[source]

Bases: textattack.models.helpers.glove_embedding_layer.EmbeddingLayer

Pre-trained GloVe (Global Vectors for Word Representation) embeddings of dimension 200.

GloVe is an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
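The "linear substructures" mentioned above are often illustrated with word analogies solved by vector arithmetic. The sketch below uses made-up two-dimensional vectors chosen so that the arithmetic works out exactly; real GloVe vectors only approximate this, and the `analogy` helper is a hypothetical name for the example.

```python
import math

# Made-up toy vectors; the two axes loosely stand for "royalty" and "gender".
vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [0.1, 0.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in {a, b, c}]
    return max(candidates, key=lambda word: cosine(vectors[word], target))


result = analogy("king", "man", "woman")
```

With these toy vectors, "king" minus "man" plus "woman" lands on the vector for "queen".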

GloVe: Global Vectors for Word Representation. (Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014.)

EMBEDDING_PATH = 'word_embeddings/glove200'
training: bool

LSTM for Classification

class textattack.models.helpers.lstm_for_classification.LSTMForClassification(hidden_size=150, depth=1, dropout=0.3, num_labels=2, max_seq_length=128, model_path=None, emb_layer_trainable=True)[source]

Bases: torch.nn.modules.module.Module

A long short-term memory neural network for text classification.

We use different versions of this network to pretrain models for text classification.

forward(_input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_pretrained(name_or_path)[source]

Load trained LSTM model by name or from path.

Parameters

name_or_path (str) – Name of the model (e.g. “lstm-imdb”) or model saved via save_pretrained().

Returns

LSTMForClassification model
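One way to picture the name-or-path behaviour is a registry lookup that falls back to treating the argument as a path. The sketch below is only an illustration under that assumption: `KNOWN_MODELS`, its contents, and `resolve_model_location` are hypothetical and are not part of TextAttack.

```python
from typing import Tuple

# Hypothetical registry; the name and location are illustrative only.
KNOWN_MODELS = {"lstm-imdb": "remote/models/lstm-imdb"}


def resolve_model_location(name_or_path: str) -> Tuple[str, str]:
    """Return ("name", location) for a registered model, else ("path", name_or_path)."""
    if name_or_path in KNOWN_MODELS:
        return ("name", KNOWN_MODELS[name_or_path])
    # Anything else is assumed to be a directory written by save_pretrained().
    return ("path", name_or_path)


by_name = resolve_model_location("lstm-imdb")
by_path = resolve_model_location("./my_saved_lstm")
```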

get_input_embeddings()[source]
load_from_disk(model_path)[source]
save_pretrained(output_path)[source]
training: bool

T5 model trained to generate text from text

class textattack.models.helpers.t5_for_text_to_text.T5ForTextToText(mode='english_to_german', output_max_length=20, input_max_length=64, num_beams=1, early_stopping=True)[source]

Bases: torch.nn.modules.module.Module

A T5 model trained to generate text from text.

For more information, please see the T5 paper, “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”. Appendix D contains information about the various tasks supported by T5.

For usage information, see HuggingFace Transformers documentation section on text-to-text with T5: https://huggingface.co/transformers/usage.html.

Parameters
  • mode (string) – Name of the T5 model to use.

  • output_max_length (int) – The maximum length of the sequence to be generated. Must be at least 1.

  • input_max_length (int) – Max length of the input sequence.

  • num_beams (int) – Number of beams for beam search. Must be at least 1; a value of 1 means no beam search.

  • early_stopping (bool) – If set to True, beam search stops once at least num_beams sentences are finished per batch. Defaults to True.
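To make the role of num_beams concrete, here is a minimal beam search over a toy next-token distribution. It is a sketch under stated assumptions, not the decoding code used by T5 or HuggingFace Transformers: `beam_search`, `toy_next_probs`, and the probabilities are invented for the example, and the stopping rule (stop once every beam has ended) is a simplification of early_stopping.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"


def toy_next_probs(seq: List[str]) -> Dict[str, float]:
    """Made-up next-token probabilities for a three-step toy language."""
    if not seq:
        return {"a": 0.6, "b": 0.4}
    if seq == ["a"]:
        return {"x": 0.55, "y": 0.45}
    if seq == ["b"]:
        return {"z": 0.9, "w": 0.1}
    return {EOS: 1.0}


def beam_search(
    next_probs: Callable[[List[str]], Dict[str, float]],
    num_beams: int,
    max_length: int,
) -> List[str]:
    """Return the best sequence found while keeping num_beams candidates per step."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [])]
    for _ in range(max_length):
        candidates: List[Tuple[float, List[str]]] = []
        for score, seq in beams:
            if seq and seq[-1] == EOS:
                candidates.append((score, seq))  # finished; carry forward unchanged
                continue
            for token, prob in next_probs(seq).items():
                candidates.append((score + math.log(prob), seq + [token]))
        # Keep only the num_beams highest-scoring candidates.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
        if all(seq[-1] == EOS for _, seq in beams):
            break
    return beams[0][1]


greedy = beam_search(toy_next_probs, num_beams=1, max_length=5)
wider = beam_search(toy_next_probs, num_beams=2, max_length=5)
```

With a single beam the search commits to "a" at the first step and ends with a sequence of probability 0.33; with two beams it keeps "b" alive long enough to find the sequence "b z", which has probability 0.36.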

classmethod from_pretrained(name_or_path)[source]

Load trained T5 model by name or from path.

Parameters

name_or_path (str) – Name of the model (e.g. “t5-en-de”) or model saved via save_pretrained().

get_input_embeddings()[source]
save_pretrained(output_dir)[source]
training: bool

Utility function for Model Wrapper

textattack.models.helpers.utils.load_cached_state_dict(model_folder_path)[source]

Word CNN for Classification

class textattack.models.helpers.word_cnn_for_classification.CNNTextLayer(n_in, widths=[3, 4, 5], filters=100)[source]

Bases: torch.nn.modules.module.Module

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

class textattack.models.helpers.word_cnn_for_classification.WordCNNForClassification(hidden_size=150, dropout=0.3, num_labels=2, max_seq_length=128, model_path=None, emb_layer_trainable=True)[source]

Bases: torch.nn.modules.module.Module

A convolutional neural network for text classification.

We use different versions of this network to pretrain models for text classification.

forward(_input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_pretrained(name_or_path)[source]

Load trained Word CNN model by name or from path.

Parameters

name_or_path (str) – Name of the model (e.g. “cnn-imdb”) or model saved via save_pretrained().

Returns

WordCNNForClassification model

get_input_embeddings()[source]
load_from_disk(model_path)[source]
save_pretrained(output_path)[source]
training: bool