textattack.shared.utils package
- class textattack.shared.utils.importing.LazyLoader(local_name, parent_module_globals, name)[source]
Bases: module
Lazily import a module, mainly to avoid pulling in large dependencies.
This allows them to only be loaded when they are used.
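The lazy-import idea described above can be illustrated with a minimal sketch. The class name LazyModuleSketch and the demo namespace are hypothetical, and this is not necessarily how LazyLoader itself is written. It only shows one common way to defer an import until the first attribute access, reusing the (local_name, parent_module_globals, name) argument order from the signature.

```python
import importlib
import types


class LazyModuleSketch(types.ModuleType):
    """Hypothetical stand-in that imports `name` on first attribute access."""

    def __init__(self, local_name, parent_module_globals, name):
        super().__init__(name)
        self._local_name = local_name
        self._parent_module_globals = parent_module_globals

    def _load(self):
        # The real import happens here, not at construction time.
        module = importlib.import_module(self.__name__)
        # Swap the placeholder for the real module in the caller's namespace.
        self._parent_module_globals[self._local_name] = module
        # Copy attributes so later lookups no longer reach __getattr__.
        self.__dict__.update(module.__dict__)
        return module

    def __getattr__(self, item):
        # Only called when normal attribute lookup fails.
        return getattr(self._load(), item)


# A plain dict stands in for a module's globals() in this demo.
namespace = {}
namespace["json"] = LazyModuleSketch("json", namespace, "json")
encoded = namespace["json"].dumps({"a": 1})  # first access triggers the import
```

After the first access, the entry in the namespace is the real module, so later lookups pay no extra cost.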
- textattack.shared.utils.importing.load_module_from_file(file_path)[source]
Uses importlib to dynamically open a file and load an object from it.
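As a rough illustration of loading code from a file path with importlib, the sketch below uses the standard spec_from_file_location / module_from_spec sequence. The helper name load_module_sketch and the file my_recipe.py are hypothetical. The library function may differ in details such as how it names the module.

```python
import importlib.util
import os
import tempfile


def load_module_sketch(file_path):
    """Hypothetical helper: execute a .py file and return it as a module."""
    module_name = os.path.splitext(os.path.basename(file_path))[0]
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_recipe.py")
    with open(path, "w") as f:
        f.write("ANSWER = 42\n")
    loaded = load_module_sketch(path)

# Objects defined in the file are now attributes of the returned module.
answer = getattr(loaded, "ANSWER")
```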
- textattack.shared.utils.install.download_from_s3(folder_name, skip_if_cached=True)[source]
The folder will be saved as <cache_dir>/textattack/<folder_name>. If it does not already exist on disk, the zip file is downloaded and extracted.
- Parameters:
folder_name (str) – path to folder or file in cache
skip_if_cached (bool) – If True, skip downloading if content is already cached.
- Returns:
path to the downloaded folder or file on disk
- Return type:
str
- textattack.shared.utils.install.download_from_url(url, save_path, skip_if_cached=True)[source]
Downloaded file will be saved under <cache_dir>/textattack/<save_path>. If it doesn’t exist on disk, the zip file will be downloaded and extracted.
- Parameters:
url (str) – URL path from which to download.
save_path (str) – path to which to save the downloaded content.
skip_if_cached (bool) – If True, skip downloading if content is already cached.
- Returns:
path to the downloaded folder or file on disk
- Return type:
str
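The skip_if_cached behaviour shared by the two download functions can be pictured with the sketch below. It performs no network access: fetch_if_missing and fake_fetch are hypothetical names, and the real functions may resolve paths and handle archives differently. Only the "<cache_dir>/textattack/<save_path>" layout is taken from the descriptions above.

```python
import os
import tempfile


def fetch_if_missing(cache_dir, save_path, fetch, skip_if_cached=True):
    """Hypothetical helper: call `fetch(target)` only when it is needed."""
    target = os.path.join(cache_dir, "textattack", save_path)
    if skip_if_cached and os.path.exists(target):
        return target  # cache hit: nothing is downloaded
    os.makedirs(os.path.dirname(target), exist_ok=True)
    fetch(target)
    return target


calls = []


def fake_fetch(target):
    # Stand-in for a real download; writes a small file instead.
    calls.append(target)
    with open(target, "w") as f:
        f.write("payload")


with tempfile.TemporaryDirectory() as cache_dir:
    first = fetch_if_missing(cache_dir, "models/demo.txt", fake_fetch)
    second = fetch_if_missing(cache_dir, "models/demo.txt", fake_fetch)
    third = fetch_if_missing(
        cache_dir, "models/demo.txt", fake_fetch, skip_if_cached=False
    )
```

In this sketch the second call is served from the cache, while the third call fetches again because skip_if_cached is False.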
- textattack.shared.utils.install.http_get(url, out_file, proxies=None)[source]
Get contents of a URL and save to a file.
https://github.com/huggingface/transformers/blob/master/src/transformers/file_utils.py
- textattack.shared.utils.install.set_cache_dir(cache_dir)[source]
Sets all relevant cache directories to TA_CACHE_DIR.
- textattack.shared.utils.install.unzip_file(path_to_zip_file, unzipped_folder_path)[source]
Unzips a .zip file to folder path.
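A minimal sketch of this kind of extraction with the standard zipfile module follows. The helper name unzip_sketch is hypothetical, and the archive is created on the fly only so that the example is self-contained.

```python
import os
import tempfile
import zipfile


def unzip_sketch(path_to_zip_file, unzipped_folder_path):
    """Hypothetical helper: extract every member of a .zip archive."""
    with zipfile.ZipFile(path_to_zip_file, "r") as archive:
        archive.extractall(unzipped_folder_path)


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny archive to extract.
    zip_path = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("hello.txt", "hi")

    out_dir = os.path.join(tmp, "out")
    unzip_sketch(zip_path, out_dir)
    with open(os.path.join(out_dir, "hello.txt")) as f:
        extracted_text = f.read()
```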
- textattack.shared.utils.misc.get_textattack_model_num_labels(model_name, model_path)[source]
Reads train_args.json and gets the number of labels for a trained model, if present.
- textattack.shared.utils.misc.html_style_from_dict(style_dict)[source]
Turns { ‘color’: ‘red’, ‘height’: ‘100px’} into style: “color: red; height: 100px”.
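The conversion can be sketched as below. The function name style_attr_sketch is hypothetical, and the exact output format (an HTML style="..." attribute with "; " separators) is an assumption of this sketch; the library's output may differ in punctuation or spacing.

```python
def style_attr_sketch(style_dict):
    """Hypothetical helper: render a dict as an inline CSS style attribute."""
    declarations = "; ".join(
        f"{key}: {value}" for key, value in style_dict.items()
    )
    return f'style="{declarations}"'


attr = style_attr_sketch({"color": "red", "height": "100px"})
```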
- textattack.shared.utils.misc.html_table_from_rows(rows, title=None, header=None, style_dict=None)[source]
- textattack.shared.utils.misc.load_textattack_model_from_path(model_name, model_path)[source]
Loads a pre-trained TextAttack model from its name and path.
For example, model_name “lstm-yelp” and model path “models/classification/lstm/yelp”.
- class textattack.shared.utils.strings.ANSI_ESCAPE_CODES[source]
Bases: object
Escape codes for printing color to the terminal.
- BOLD = '\x1b[1m'
- BROWN = '\x1b[38:5:52m'
- CYAN = '\x1b[96m'
- FAIL = '\x1b[91m'
- GRAY = '\x1b[38:5:240m'
- HEADER = '\x1b[95m'
- OKBLUE = '\x1b[94m'
- OKGREEN = '\x1b[92m'
- ORANGE = '\x1b[38:5:208m'
- PINK = '\x1b[95m'
- PURPLE = '\x1b[35m'
- STOP = '\x1b[0m'
This code stops the current color sequence.
- UNDERLINE = '\x1b[4m'
- WARNING = '\x1b[93m'
- YELLOW = '\x1b[93m'
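A short sketch of how such escape codes are typically combined: one or more codes are written before the text and the reset code after it. The constants below copy three values from the list above, and colorize_sketch is a hypothetical helper, not part of the class.

```python
# Values copied from the attribute list above.
OKGREEN = "\x1b[92m"
BOLD = "\x1b[1m"
STOP = "\x1b[0m"


def colorize_sketch(text, *codes):
    """Hypothetical helper: prefix `text` with escape codes, then reset."""
    return "".join(codes) + text + STOP


message = colorize_sketch("attack succeeded", BOLD, OKGREEN)
print(message)  # rendered bold and green on terminals that support ANSI codes
```

Ending with the reset code keeps the styling from leaking into whatever is printed next.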
- class textattack.shared.utils.strings.ReprMixin[source]
Bases: object
Mixin for enhanced __repr__ and __str__.
- textattack.shared.utils.strings.check_if_punctuations(word)[source]
Returns True if word is just a sequence of punctuation characters.
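One plausible way to implement this check, assuming that "punctuation" means the ASCII characters in string.punctuation, is sketched below. The name is_all_punctuation_sketch is hypothetical. Note that in this sketch an empty string counts as True, because all() over an empty sequence is True.

```python
import string


def is_all_punctuation_sketch(word):
    """Hypothetical helper: True when every character is ASCII punctuation."""
    return all(char in string.punctuation for char in word)


only_marks = is_all_punctuation_sketch("!?...")
mixed = is_all_punctuation_sketch("hi!")
```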
- textattack.shared.utils.strings.check_if_subword(token, model_type, starting=False)[source]
Check if token is a subword token that is not a standalone word.
- Parameters:
token (str) – token to check.
model_type (str) – type of model (options: “bert”, “roberta”, “xlnet”).
starting (bool) – Should be set to True if this token is the starting token of the overall text. This matters because models like RoBERTa do not add “Ġ” to the beginning token.
- Returns:
True if token is a subword token.
- Return type:
bool
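The check can be sketched under some assumptions about tokenizer conventions: BERT WordPiece marks word continuations with a "##" prefix, RoBERTa's byte-level BPE marks word-initial tokens with "Ġ" (except the first token of the text), and XLNet's SentencePiece marks word-initial tokens with "▁". The function name is_subword_sketch, the exact markers, and the ValueError for unknown model types are choices of this sketch and may not match the library.

```python
def is_subword_sketch(token, model_type, starting=False):
    """Hypothetical helper: is `token` a continuation of a previous word?"""
    if model_type == "bert":
        # Assumption: WordPiece continuation pieces start with "##".
        return token.startswith("##")
    if model_type == "roberta":
        # Assumption: the first token of the text carries no "Ġ" marker,
        # so it cannot be judged by the marker and is treated as a word.
        if starting:
            return False
        return not token.startswith("Ġ")
    if model_type == "xlnet":
        # Assumption: SentencePiece word-initial pieces start with "▁".
        return not token.startswith("▁")
    raise ValueError(f"unsupported model_type: {model_type}")


bert_piece = is_subword_sketch("##ing", "bert")
first_token = is_subword_sketch("Hello", "roberta", starting=True)
```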
- textattack.shared.utils.strings.color_from_label(label_num)[source]
Arbitrary colors for different labels.
- textattack.shared.utils.strings.color_from_output(label_name, label)[source]
Returns the correct color for a label name, like ‘positive’, ‘medicine’, or ‘entailment’.
- textattack.shared.utils.strings.flair_tag(sentence, tag_type='upos-fast')[source]
Tags a Sentence object using flair part-of-speech tagger.
- textattack.shared.utils.strings.has_letter(word)[source]
Returns true if word contains at least one character in [A-Za-z].
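A one-line regular expression is enough to express this check. The sketch below uses a hypothetical name, has_letter_sketch, and only matches the ASCII range given in the description.

```python
import re


def has_letter_sketch(word):
    """Hypothetical helper: True if `word` has at least one ASCII letter."""
    return re.search("[A-Za-z]", word) is not None


with_letter = has_letter_sketch("42nd")
without_letter = has_letter_sketch("1234!")
```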
- textattack.shared.utils.strings.process_label_name(label_name)[source]
Takes a label name from a dataset and cleans it up for display.
Meant to correct common abbreviations and capitalize the name automatically.
- textattack.shared.utils.strings.strip_BPE_artifacts(token, model_type)[source]
Strip characters such as “Ġ” that are left over from BPE tokenization.
- Parameters:
token (str) – token to strip.
model_type (str) – type of model (options: “bert”, “roberta”, “xlnet”)
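A minimal sketch of this kind of cleanup, assuming one leading marker per model type ("##" for BERT, "Ġ" for RoBERTa, "▁" for XLNet). The marker table and the name strip_artifacts_sketch are hypothetical, and the library may strip different or additional characters.

```python
# Assumed tokenizer markers; not taken from the library.
MARKERS = {"bert": "##", "roberta": "Ġ", "xlnet": "▁"}


def strip_artifacts_sketch(token, model_type):
    """Hypothetical helper: drop a leading tokenizer marker, if present."""
    marker = MARKERS.get(model_type)
    if marker and token.startswith(marker):
        return token[len(marker):]
    return token  # unknown model type or no marker: leave token unchanged


cleaned = strip_artifacts_sketch("Ġworld", "roberta")
```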
- textattack.shared.utils.strings.words_from_text(s, words_to_ignore=[])[source]
Lowercases a string, removes all non-alphanumeric characters, and splits into words.
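The description can be read literally as the sketch below: lowercase, drop every character that is neither alphanumeric nor whitespace, then split. The name words_sketch is hypothetical and the library's tokenization rules may be more permissive (for example around apostrophes or hyphens). The sketch uses an immutable default for words_to_ignore, since a mutable default such as [] is shared between calls in Python.

```python
import re


def words_sketch(s, words_to_ignore=()):
    """Hypothetical helper following the one-line description literally."""
    # Lowercase, then remove anything that is not a-z, 0-9, or whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", "", s.lower())
    return [word for word in cleaned.split() if word not in words_to_ignore]


words = words_sketch("Hello, World! It's 2024.")
filtered = words_sketch("The cat and the hat", words_to_ignore=("the", "and"))
```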
- textattack.shared.utils.strings.zip_flair_result(pred, tag_type='upos-fast')[source]
Takes a sentence tagging from flair and returns two lists, of words and their corresponding parts-of-speech.