textattack.shared.utils package

class textattack.shared.utils.importing.LazyLoader(local_name, parent_module_globals, name)[source]

Bases: module

Lazily import a module, mainly to avoid pulling in large dependencies.

This allows them to only be loaded when they are used.
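The deferred-import idea can be sketched as follows. This is a minimal illustration, not the TextAttack implementation: the class name LazyModule is hypothetical, and only the constructor arguments mirror the signature documented above.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical stand-in: defers the real import until first attribute access."""

    def __init__(self, local_name, parent_module_globals, name):
        super().__init__(name)
        self._local_name = local_name
        self._parent_module_globals = parent_module_globals

    def _load(self):
        # The actual import happens here, on first use.
        module = importlib.import_module(self.__name__)
        # Swap the placeholder in the caller's namespace for the real module.
        self._parent_module_globals[self._local_name] = module
        # Copy attributes so later lookups on this object skip __getattr__.
        self.__dict__.update(module.__dict__)
        return module

    def __getattr__(self, item):
        # Only reached when normal lookup fails, i.e. before the module is loaded.
        return getattr(self._load(), item)


namespace = {}
lazy_json = LazyModule("json", namespace, "json")
namespace["json"] = lazy_json

# The first attribute access triggers the real import.
encoded = lazy_json.dumps({"a": 1})
```

The same pattern is most useful for heavy optional dependencies, where the import cost is only paid by code paths that need them.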

textattack.shared.utils.importing.load_module_from_file(file_path)[source]

Uses importlib to dynamically open a file and load an object from it.
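A minimal sketch of loading a Python file by path with importlib is shown below. The helper name load_module_sketch and the file name my_recipe.py are hypothetical, and the sketch returns the module object; what the library function returns should be checked against its source.

```python
import importlib.util
import os
import tempfile


def load_module_sketch(file_path):
    """Hypothetical helper: import a .py file by path and return the module."""
    module_name = os.path.splitext(os.path.basename(file_path))[0]
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    # Execute the file's top-level code inside the new module object.
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_recipe.py")
    with open(path, "w") as f:
        f.write("GREETING = 'hello'\n")
    loaded = load_module_sketch(path)
    greeting = loaded.GREETING
```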

textattack.shared.utils.install.download_from_s3(folder_name, skip_if_cached=True)[source]

The folder will be saved as <cache_dir>/textattack/<folder_name>. If it doesn’t exist on disk, the zip file will be downloaded and extracted.

Parameters
  • folder_name (str) – path to folder or file in cache

  • skip_if_cached (bool) – If True, skip downloading if content is already cached.

Returns

path to the downloaded folder or file on disk

Return type

str

textattack.shared.utils.install.download_from_url(url, save_path, skip_if_cached=True)[source]

Downloaded file will be saved under <cache_dir>/textattack/<save_path>. If it doesn’t exist on disk, the zip file will be downloaded and extracted.

Parameters
  • url (str) – URL path from which to download.

  • save_path (str) – path to which to save the downloaded content.

  • skip_if_cached (bool) – If True, skip downloading if content is already cached.

Returns

path to the downloaded folder or file on disk

Return type

str
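The skip_if_cached behaviour described for the download helpers can be illustrated without any network access. In this sketch the function cached_download_sketch and the fetch callable are hypothetical; only the cache layout <cache_dir>/textattack/<save_path> comes from the description above.

```python
import os
import tempfile


def cached_download_sketch(save_path, cache_dir, fetch, skip_if_cached=True):
    """Hypothetical helper showing only the skip_if_cached decision.

    `fetch` is a caller-supplied callable that writes content to a path.
    """
    target = os.path.join(cache_dir, "textattack", save_path)
    if skip_if_cached and os.path.exists(target):
        return target  # reuse what is already on disk
    os.makedirs(os.path.dirname(target), exist_ok=True)
    fetch(target)
    return target


calls = []


def fake_fetch(path):
    # Stands in for the real download; records each call.
    calls.append(path)
    with open(path, "w") as f:
        f.write("payload")


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_download_sketch("models/demo.txt", cache_dir, fake_fetch)
    second = cached_download_sketch("models/demo.txt", cache_dir, fake_fetch)
    third = cached_download_sketch(
        "models/demo.txt", cache_dir, fake_fetch, skip_if_cached=False
    )
```

Here the second call returns the cached path without fetching, while the third call fetches again because skip_if_cached is False.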

textattack.shared.utils.install.http_get(url, out_file, proxies=None)[source]

Get contents of a URL and save to a file.

Reference: https://github.com/huggingface/transformers/blob/master/src/transformers/file_utils.py

textattack.shared.utils.install.path_in_cache(file_path)[source]
textattack.shared.utils.install.s3_url(uri)[source]
textattack.shared.utils.install.set_cache_dir(cache_dir)[source]

Sets all relevant cache directories to TA_CACHE_DIR.

textattack.shared.utils.install.unzip_file(path_to_zip_file, unzipped_folder_path)[source]

Unzips a .zip file to folder path.
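Extracting an archive to a folder can be sketched with the standard zipfile module. The helper name unzip_sketch is hypothetical; the argument names follow the signature above.

```python
import os
import tempfile
import zipfile


def unzip_sketch(path_to_zip_file, unzipped_folder_path):
    """Hypothetical helper: extract every member of a .zip into a folder."""
    with zipfile.ZipFile(path_to_zip_file, "r") as archive:
        archive.extractall(unzipped_folder_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a small archive to extract.
    zip_path = os.path.join(tmp_dir, "demo.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("demo/readme.txt", "hello")

    out_dir = os.path.join(tmp_dir, "unzipped")
    unzip_sketch(zip_path, out_dir)
    with open(os.path.join(out_dir, "demo", "readme.txt")) as f:
        extracted_text = f.read()
```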

textattack.shared.utils.misc.get_textattack_model_num_labels(model_name, model_path)[source]

Reads train_args.json and gets the number of labels for a trained model, if present.

textattack.shared.utils.misc.hashable(key)[source]
textattack.shared.utils.misc.html_style_from_dict(style_dict)[source]

Turns a style dictionary such as

{ 'color': 'red', 'height': '100px'}

into

style: "color: red; height: 100px"
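The conversion can be sketched as a join over the dictionary items. The helper style_string_sketch is hypothetical and builds only the value inside the quotes; the exact string the library function returns, including any attribute wrapper or separators, should be checked against its source.

```python
def style_string_sketch(style_dict):
    """Hypothetical helper: join CSS properties as 'key: value' pairs."""
    return "; ".join(f"{key}: {value}" for key, value in style_dict.items())


style = style_string_sketch({"color": "red", "height": "100px"})
```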

textattack.shared.utils.misc.html_table_from_rows(rows, title=None, header=None, style_dict=None)[source]
textattack.shared.utils.misc.load_textattack_model_from_path(model_name, model_path)[source]

Loads a pre-trained TextAttack model from its name and path.

For example, model_name “lstm-yelp” and model path “models/classification/lstm/yelp”.

textattack.shared.utils.misc.set_seed(random_seed)[source]
textattack.shared.utils.misc.sigmoid(n)[source]
class textattack.shared.utils.strings.ANSI_ESCAPE_CODES[source]

Bases: object

Escape codes for printing color to the terminal.

BOLD = '\x1b[1m'
FAIL = '\x1b[91m'
GRAY = '\x1b[37m'
HEADER = '\x1b[95m'
OKBLUE = '\x1b[94m'
OKGREEN = '\x1b[92m'
PURPLE = '\x1b[35m'
STOP = '\x1b[0m'

This code stops the current color sequence.

UNDERLINE = '\x1b[4m'

WARNING = '\x1b[93m'
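A short sketch of how such codes are typically used: one or more codes are written before the text, and the reset code is written after it. The constants below are copied from the listing above so the example stands alone; the colorize helper is hypothetical.

```python
# Values copied from the listing above.
BOLD = "\x1b[1m"
FAIL = "\x1b[91m"
STOP = "\x1b[0m"


def colorize(text, *codes):
    """Hypothetical helper: prefix text with codes, then reset with STOP."""
    return "".join(codes) + text + STOP


message = colorize("attack failed", BOLD, FAIL)
print(message)  # renders bold red in terminals that support ANSI codes
```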
textattack.shared.utils.strings.add_indent(s_, numSpaces)[source]
textattack.shared.utils.strings.check_if_punctuations(word)[source]

Returns True if word consists only of punctuation characters.

textattack.shared.utils.strings.check_if_subword(token, model_type, starting=False)[source]

Check if token is a subword token that is not a standalone word.

Parameters
  • token (str) – token to check.

  • model_type (str) – type of model (options: “bert”, “roberta”, “xlnet”).

  • starting (bool) – Should be set to True if this token is the starting token of the overall text. This matters because models like RoBERTa do not add “Ġ” to the beginning token.

Returns

True if token is a subword token.

Return type

bool
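The check can be sketched from common tokenizer conventions. This is an illustration under stated assumptions, not the library's code: the function name looks_like_subword is hypothetical, and the markers assumed here are "##" for BERT WordPiece continuations, "Ġ" for RoBERTa tokens that follow a space, and "▁" (U+2581) for XLNet SentencePiece word starts.

```python
def looks_like_subword(token, model_type, starting=False):
    """Hypothetical sketch: True if token continues a word rather than starting one."""
    if model_type == "bert":
        # Assumed convention: WordPiece prefixes continuation pieces with "##".
        return token.startswith("##")
    if model_type == "roberta":
        # Assumed convention: byte-level BPE prefixes tokens that follow a
        # space with "Ġ". The first token of the text carries no marker,
        # so it cannot be judged by the prefix.
        if starting:
            return False
        return not token.startswith("Ġ")
    if model_type == "xlnet":
        # Assumed convention: SentencePiece prefixes word starts with U+2581.
        return not token.startswith("\u2581")
    raise ValueError(f"Unsupported model_type: {model_type}")


bert_piece = looks_like_subword("##ing", "bert")
roberta_first = looks_like_subword("Hello", "roberta", starting=True)
roberta_piece = looks_like_subword("ing", "roberta")
```

The starting flag only matters for the RoBERTa branch, which matches the parameter description above.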

textattack.shared.utils.strings.color_from_label(label_num)[source]

Arbitrary colors for different labels.

textattack.shared.utils.strings.color_from_output(label_name, label)[source]

Returns the correct color for a label name, like ‘positive’, ‘medicine’, or ‘entailment’.

textattack.shared.utils.strings.color_text(text, color=None, method=None)[source]
textattack.shared.utils.strings.default_class_repr(self)[source]
textattack.shared.utils.strings.flair_tag(sentence, tag_type='upos-fast')[source]

Tags a Sentence object using flair part-of-speech tagger.

textattack.shared.utils.strings.has_letter(word)[source]

Returns True if word contains at least one character in [A-Za-z].
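A one-line regular expression is enough to express this check. The helper name has_letter_sketch is hypothetical; the character class is the one given in the description.

```python
import re


def has_letter_sketch(word):
    """Hypothetical helper: True if word has at least one ASCII letter."""
    return re.search("[A-Za-z]", word) is not None


mixed = has_letter_sketch("a1")
digits_only = has_letter_sketch("1234")
```

Because the class is limited to [A-Za-z], accented or non-Latin letters do not count as letters under this check.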

textattack.shared.utils.strings.is_one_word(word)[source]
textattack.shared.utils.strings.process_label_name(label_name)[source]

Takes a label name from a dataset and cleans it up for display.

Meant to correct different abbreviations and automatically capitalize.

textattack.shared.utils.strings.strip_BPE_artifacts(token, model_type)[source]

Strip characters such as “Ġ” that are left over from BPE tokenization.

Parameters
  • token (str) – token to strip.

  • model_type (str) – type of model (options: “bert”, “roberta”, “xlnet”)
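Stripping the markers can be sketched per model type. As with any illustration of this kind, the function name strip_markers_sketch is hypothetical and the marker characters are assumptions based on common tokenizer conventions ("##" for BERT, "Ġ" for RoBERTa, U+2581 for XLNet), not a copy of the library's logic.

```python
def strip_markers_sketch(token, model_type):
    """Hypothetical sketch: remove tokenizer marker characters from a token."""
    if model_type == "bert":
        # Assumed convention: continuation pieces start with "##".
        return token[2:] if token.startswith("##") else token
    if model_type == "roberta":
        # Assumed convention: "Ġ" marks a preceding space.
        return token.replace("Ġ", "")
    if model_type == "xlnet":
        # Assumed convention: U+2581 marks the start of a word.
        return token.lstrip("\u2581")
    raise ValueError(f"Unsupported model_type: {model_type}")


bert_clean = strip_markers_sketch("##ing", "bert")
roberta_clean = strip_markers_sketch("Ġhello", "roberta")
xlnet_clean = strip_markers_sketch("\u2581hello", "xlnet")
```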

textattack.shared.utils.strings.words_from_text(s, words_to_ignore=[])[source]
textattack.shared.utils.strings.zip_flair_result(pred, tag_type='upos-fast')[source]

Takes a sentence tagging from flair and returns two lists: the words and their corresponding parts-of-speech.

textattack.shared.utils.strings.zip_stanza_result(pred, tagset='universal')[source]

Takes the first sentence from a document from stanza and returns two lists, one of words and the other of their corresponding parts-of-speech.

textattack.shared.utils.tensor.batch_model_predict(model_predict, inputs, batch_size=32)[source]

Runs prediction on iterable inputs using batch size batch_size.

Aggregates all predictions into an np.ndarray.
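The batching loop can be sketched as below. To stay dependency-free, this illustration collects predictions in a plain list rather than an np.ndarray, so it is a simplified stand-in for the documented behaviour; the names batch_predict_sketch and fake_model are hypothetical.

```python
def batch_predict_sketch(model_predict, inputs, batch_size=32):
    """Hypothetical sketch: call model_predict on slices of at most batch_size."""
    inputs = list(inputs)
    outputs = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]
        # Each call returns one prediction per input in the batch.
        outputs.extend(model_predict(batch))
    return outputs


batch_sizes = []


def fake_model(batch):
    # Stands in for a real model: "predicts" the length of each text.
    batch_sizes.append(len(batch))
    return [len(text) for text in batch]


texts = ["a", "bb", "ccc", "dddd", "eeeee"]
predictions = batch_predict_sketch(fake_model, texts, batch_size=2)
```

With five inputs and a batch size of 2, the model is called three times, the last time with a single remaining input.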