The importance of constraints

Constraints determine which potential adversarial examples are valid inputs to the model. When determining the efficacy of an attack, constraints are everything. After all, an attack that looks very powerful may just be generating nonsense. Or, perhaps more nefariously, an attack may generate a real-looking example that changes the original label of the input. That’s why you should always clearly define the constraints your adversarial examples must meet.

Please remember to run pip3 install textattack[tensorflow] in your notebook environment before running the code below:

Classes of constraints

TextAttack evaluates constraints using methods from three groups:

  • Overlap constraints determine if a perturbation is valid based on character-level analysis. For example, some attacks are constrained by edit distance: a perturbation is only valid if it changes at most some small number of characters.

  • Grammaticality constraints filter inputs based on syntactical information. For example, an attack may require that adversarial perturbations do not introduce grammatical errors.

  • Semantic constraints try to ensure that the perturbation is semantically similar to the original input. For example, we may design a constraint that uses a sentence encoder to encode the original and perturbed inputs, and enforce that the sentence encodings be within some fixed distance of one another. (This is what happens in subclasses of textattack.constraints.semantics.sentence_encoders.)
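
To make the overlap case concrete, here is a minimal sketch of an edit-distance check written in plain Python. It is not TextAttack's implementation; the function names and the max_edits parameter are assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute (or keep)
        previous = current
    return previous[-1]


def within_edit_distance(original: str, perturbed: str, max_edits: int) -> bool:
    """Hypothetical overlap check: valid only if at most `max_edits` edits were made."""
    return levenshtein(original, perturbed) <= max_edits


print(within_edit_distance("a great movie", "a graet movie", max_edits=2))    # True
print(within_edit_distance("a great movie", "a terrible film", max_edits=2))  # False
```

A small typo-style perturbation passes, while a rewrite of several words does not.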

A new constraint

To add our own constraint, we need to create a subclass of textattack.constraints.Constraint. We can implement one of two functions, either _check_constraint or _check_constraint_many:

  • _check_constraint determines whether candidate AttackedText transformed_text, transformed from current_text, fulfills a desired constraint. It returns either True or False.

  • _check_constraint_many determines whether each candidate in a list transformed_texts fulfills the constraint relative to current_text. This is here in case your constraint can be vectorized. If not, just implement _check_constraint, and _check_constraint will be executed for each (transformed_text, current_text) pair.
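
The relationship between the two methods can be sketched with a toy base class. ToyConstraint and SameWordCount below are hypothetical stand-ins that operate on plain strings, not TextAttack's classes; the sketch also assumes the batched method returns the candidates that pass.

```python
class ToyConstraint:
    """Hypothetical stand-in for a constraint base class (not TextAttack's code)."""

    def _check_constraint(self, transformed_text, current_text):
        raise NotImplementedError

    def _check_constraint_many(self, transformed_texts, current_text):
        # Default behaviour: run the single-candidate check once per candidate
        # and keep only the candidates that pass.
        return [t for t in transformed_texts
                if self._check_constraint(t, current_text)]


class SameWordCount(ToyConstraint):
    """Only the per-candidate method is implemented; the loop is inherited."""

    def _check_constraint(self, transformed_text, current_text):
        return len(transformed_text.split()) == len(current_text.split())


candidates = ["the film was fine", "the film was really fine", "a movie was fine"]
kept = SameWordCount()._check_constraint_many(candidates, "the film was good")
print(kept)  # ['the film was fine', 'a movie was fine']
```

A constraint that can score a whole batch at once (for example, with a sentence encoder) would override the batched method instead.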

A custom constraint

For fun, we’re going to see what happens when we constrain an attack to only allow perturbations that substitute out a named entity for another. In linguistics, a named entity is a proper noun, the name of a person, organization, location, product, etc. Named Entity Recognition is a popular NLP task (and one that state-of-the-art models can perform quite well).

NLTK and Named Entity Recognition

NLTK, the Natural Language Toolkit, is a Python package that helps developers write programs that process natural language. NLTK comes with predefined algorithms for lots of linguistic tasks, including Named Entity Recognition.

First, we’re going to write a constraint class. In the _check_constraint method, we’re going to use NLTK to find the named entities in both current_text and transformed_text. We will only return True (that is, our constraint is met) if transformed_text has substituted one named entity in current_text for another.

Let’s import NLTK and download the required modules:

[1]:
import nltk
nltk.download('punkt') # The NLTK tokenizer
nltk.download('maxent_ne_chunker') # NLTK named-entity chunker
nltk.download('words') # NLTK list of words
nltk.download('averaged_perceptron_tagger')
[nltk_data] Downloading package punkt to /u/lab/jy2ma/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     /u/lab/jy2ma/nltk_data...
[nltk_data]   Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package words to /u/lab/jy2ma/nltk_data...
[nltk_data]   Package words is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /u/lab/jy2ma/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[1]:
True

NLTK NER Example

Here’s an example of using NLTK to find the named entities in a sentence:

[2]:
sentence = ('In 2017, star quarterback Tom Brady led the Patriots to the Super Bowl, '
           'but lost to the Philadelphia Eagles.')

# 1. Tokenize using the NLTK tokenizer.
tokens = nltk.word_tokenize(sentence)

# 2. Tag parts of speech using the NLTK part-of-speech tagger.
tagged = nltk.pos_tag(tokens)

# 3. Extract entities from tagged sentence.
entities = nltk.chunk.ne_chunk(tagged)
print(entities)
(S
  In/IN
  2017/CD
  ,/,
  star/NN
  quarterback/NN
  (PERSON Tom/NNP Brady/NNP)
  led/VBD
  the/DT
  (ORGANIZATION Patriots/NNP)
  to/TO
  the/DT
  (ORGANIZATION Super/NNP Bowl/NNP)
  ,/,
  but/CC
  lost/VBD
  to/TO
  the/DT
  (ORGANIZATION Philadelphia/NNP Eagles/NNP)
  ./.)

It looks like nltk.chunk.ne_chunk gives us an nltk.tree.Tree object where named entities are also nltk.tree.Tree objects within that tree. We can take this a step further and grab the named entities from the tree of entities:

[3]:
# 4. Filter entities to just named entities.
named_entities = [entity for entity in entities if isinstance(entity, nltk.tree.Tree)]
print(named_entities)
[Tree('PERSON', [('Tom', 'NNP'), ('Brady', 'NNP')]), Tree('ORGANIZATION', [('Patriots', 'NNP')]), Tree('ORGANIZATION', [('Super', 'NNP'), ('Bowl', 'NNP')]), Tree('ORGANIZATION', [('Philadelphia', 'NNP'), ('Eagles', 'NNP')])]

Caching with @functools.lru_cache

A little-known feature of Python 3 is functools.lru_cache, a decorator that allows users to easily cache the results of a function in an LRU cache. We’re going to be using the NLTK library quite a bit to tokenize, parse, and detect named entities in sentences. These sentences might repeat themselves. As such, we’ll use this decorator to cache named entities so that we don’t have to perform this expensive computation multiple times.
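
As a quick illustration of the caching behaviour, the sketch below wraps a toy function (slow_word_count, a made-up stand-in for the expensive NLTK pipeline) and counts how often its body actually runs.

```python
import functools

call_count = 0  # counts how many times the function body really executes


@functools.lru_cache(maxsize=128)
def slow_word_count(sentence):
    """Toy stand-in for an expensive computation on a sentence."""
    global call_count
    call_count += 1
    return len(sentence.split())


slow_word_count("Tom Brady led the Patriots")
slow_word_count("Tom Brady led the Patriots")  # same argument: served from the cache

print(call_count)                    # 1
print(slow_word_count.cache_info())  # reports 1 hit and 1 miss
```

Note that lru_cache requires hashable arguments, which is why the helper in this tutorial takes the sentence as a plain string.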

Putting it all together: getting a list of Named Entity Labels from a sentence

Now that we know how to tokenize, parse, and detect named entities using NLTK, let’s put it all together into a single helper function. Later, when we implement our constraint, we can query this function to easily get the entity labels from a sentence. We can even use @functools.lru_cache to try and speed this process up.

[4]:
import functools

@functools.lru_cache(maxsize=2**14)
def get_entities(sentence):
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)
    # Setting `binary=True` makes NLTK label every named-entity chunk
    # as 'NE' instead of detailed labels like 'ORGANIZATION' or 'GPE'.
    # `leaves()` then flattens the tree into (word, POS tag) pairs.
    entities = nltk.chunk.ne_chunk(tagged, binary=True)
    return entities.leaves()

And let’s test our function to make sure it works:

[5]:
sentence = 'Jack Black starred in the 2003 film classic "School of Rock".'
get_entities(sentence)
[5]:
[('Jack', 'NNP'),
 ('Black', 'NNP'),
 ('starred', 'VBD'),
 ('in', 'IN'),
 ('the', 'DT'),
 ('2003', 'CD'),
 ('film', 'NN'),
 ('classic', 'JJ'),
 ('``', '``'),
 ('School', 'NNP'),
 ('of', 'IN'),
 ('Rock', 'NNP'),
 ("''", "''"),
 ('.', '.')]

We flattened the tree of entities, so the return format is a list of (word, tag) tuples, where the tag is the word’s part of speech. 'NNP' (a proper noun, according to NLTK) is our indicator of a named entity. Looks like we identified three named entities here: ‘Jack Black’, ‘School’, and ‘Rock’. Whatever technique NLTK uses for named entity recognition may be a bit rough, but it did a pretty decent job here!
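
To see how the 'NNP' tokens in that output line up with names, here is a small sketch that joins runs of adjacent 'NNP' tokens. The tagged list is copied from the output above; group_proper_nouns is a hypothetical helper and is not used by the constraint itself.

```python
from itertools import groupby

# (word, tag) pairs copied from the get_entities output above.
tagged = [('Jack', 'NNP'), ('Black', 'NNP'), ('starred', 'VBD'), ('in', 'IN'),
          ('the', 'DT'), ('2003', 'CD'), ('film', 'NN'), ('classic', 'JJ'),
          ('``', '``'), ('School', 'NNP'), ('of', 'IN'), ('Rock', 'NNP'),
          ("''", "''"), ('.', '.')]


def group_proper_nouns(tagged_tokens):
    """Join each run of consecutive 'NNP' tokens into a single string."""
    groups = []
    for is_nnp, run in groupby(tagged_tokens, key=lambda pair: pair[1] == 'NNP'):
        if is_nnp:
            groups.append(' '.join(word for word, _ in run))
    return groups


print(group_proper_nouns(tagged))  # ['Jack Black', 'School', 'Rock']
```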

Creating our NamedEntityConstraint

Now that we know how to detect named entities using NLTK, let’s create our custom constraint.

[6]:
from textattack.constraints import Constraint

class NamedEntityConstraint(Constraint):
    """A constraint that ensures `transformed_text` only substitutes named entities from `current_text` with other named entities."""

    def _check_constraint(self, transformed_text, current_text):
        transformed_entities = get_entities(transformed_text.text)
        current_entities = get_entities(current_text.text)
        # If NLTK returned no tokens for `current_text`, there is nothing to
        # swap, so let's return False (the attack will eventually fail).
        if len(current_entities) == 0:
            return False
        if len(current_entities) != len(transformed_entities):
            # If the two sentences have a different number of tokens, we can't
            # line them up word by word. In this case, the constraint is
            # violated, and we return False.
            return False
        # Here we compare all of the words, in order, to make sure that they match.
        # If we find two words that don't match, this means a word was swapped
        # between `current_text` and `transformed_text`. That word must be a
        # named entity to fulfill our constraint.
        for (word_1, label_1), (word_2, label_2) in zip(current_entities, transformed_entities):
            if word_1 != word_2:
                # Make sure that the swapped words are tagged as named entities in
                # both texts. If they're not, then we also return False.
                if (label_1 not in ['NNP', 'NE']) or (label_2 not in ['NNP', 'NE']):
                    return False
        # If we get here, every swapped word was a named entity. Return True!
        return True
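
The comparison logic can be exercised without NLTK or TextAttack by feeding it tagged tokens directly. The sketch below mirrors the checks in the constraint; the function name and the hand-written tags are assumptions for illustration, not real tagger output.

```python
PROPER_NOUN_TAGS = ('NNP', 'NE')


def only_proper_nouns_swapped(current_tagged, transformed_tagged):
    """Mirror of the constraint's checks, applied to lists of (word, tag) pairs."""
    if len(current_tagged) == 0 or len(current_tagged) != len(transformed_tagged):
        return False
    for (word_1, tag_1), (word_2, tag_2) in zip(current_tagged, transformed_tagged):
        swapped = word_1 != word_2
        if swapped and (tag_1 not in PROPER_NOUN_TAGS or tag_2 not in PROPER_NOUN_TAGS):
            return False
    return True


# Hand-written tags (an assumption, not real NLTK output).
current = [('Tom', 'NNP'), ('led', 'VBD'), ('the', 'DT'), ('Patriots', 'NNP')]
name_swap = [('Bob', 'NNP'), ('led', 'VBD'), ('the', 'DT'), ('Patriots', 'NNP')]
verb_swap = [('Tom', 'NNP'), ('guided', 'VBD'), ('the', 'DT'), ('Patriots', 'NNP')]

print(only_proper_nouns_swapped(current, name_swap))  # True
print(only_proper_nouns_swapped(current, verb_swap))  # False
```

One consequence of this logic is that an unchanged text also passes, since no mismatched word is ever found.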

Testing our constraint

We need to create an attack and a dataset to test our constraint on. We went over all of this in the transformations tutorial, so let’s gloss over this part for now.

[15]:
# Import the model
import transformers
from textattack.models.wrappers import HuggingFaceModelWrapper

model = transformers.AutoModelForSequenceClassification.from_pretrained("textattack/albert-base-v2-ag-news")
tokenizer = transformers.AutoTokenizer.from_pretrained("textattack/albert-base-v2-ag-news")

model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Create the goal function using the model
from textattack.goal_functions import UntargetedClassification
goal_function = UntargetedClassification(model_wrapper)

# Import the dataset
from textattack.datasets import HuggingFaceDataset
dataset = HuggingFaceDataset("ag_news", None, "test")





textattack: Unknown if model of class <class 'transformers.models.albert.modeling_albert.AlbertForSequenceClassification'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.
Using custom data configuration default
Reusing dataset ag_news (/p/qdata/jy2ma/.cache/textattack/datasets/ag_news/default/0.0.0/0eeeaaa5fb6dffd81458e293dfea1adba2881ffcbdc3fb56baeb5a892566c29a)
textattack: Loading datasets dataset ag_news, split test.
[16]:
from textattack.transformations import WordSwapEmbedding
from textattack.search_methods import GreedyWordSwapWIR
from textattack import Attack
from textattack.constraints.pre_transformation import RepeatModification, StopwordModification

# We're going to use the `WordSwapEmbedding` transformation. Using the default settings, this
# will try substituting words with their neighbors in the counter-fitted embedding space.
transformation = WordSwapEmbedding(max_candidates=20)

# We'll use the greedy search with word importance ranking method again
search_method = GreedyWordSwapWIR()

# Our constraints will be the same as Tutorial 1, plus the named entity constraint
constraints = [RepeatModification(),
               StopwordModification(),
               NamedEntityConstraint(False)]

# Now, let's make the attack using these parameters.
attack = Attack(goal_function, constraints, transformation, search_method)


Now, let’s use our attack. We’re going to attack samples until we achieve 5 successes. (There’s a lot to check here, and since we’re using a greedy search over all potential word swap positions, each sample will take a few minutes. This will take a few hours to run on a single core.)

[17]:
from textattack.loggers import CSVLogger # tracks a dataframe for us.
from textattack.attack_results import SuccessfulAttackResult
from textattack import Attacker, AttackArgs

attack_args = AttackArgs(num_successful_examples=5, log_to_csv="results.csv", csv_coloring_style="html")
attacker = Attacker(attack, dataset, attack_args)

attacker.attack_dataset()
textattack: Logging to CSV at path results.csv


  0%|          | 0/5 [00:00<?, ?it/s]
Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  unk
  )
  (goal_function):  UntargetedClassification
  (transformation):  WordSwapEmbedding(
    (max_candidates):  20
    (embedding):  WordEmbedding
  )
  (constraints):
    (0): NamedEntityConstraint(
        (compare_against_original):  False
      )
    (1): RepeatModification
    (2): StopwordModification
  (is_black_box):  True
)



 20%|██        | 1/5 [00:01<00:06,  1.57s/it]

[Succeeded / Failed / Skipped / Total] 1 / 0 / 0 / 1:  20%|██        | 1/5 [00:01<00:06,  1.58s/it]
--------------------------------------------- Result 1 ---------------------------------------------
Business (75%) --> Sci/tech (61%)

Fears for T N pension after talks Unions representing workers at Turner   Newall say they are 'disappointed' after talks with stricken parent firm Federal Mogul.

Fears for T N pension after talks Unions representing workers at Knapp   Newall say they are 'disappointed' after talks with stricken parent firm Federal Titan.




[Succeeded / Failed / Skipped / Total] 1 / 1 / 0 / 2:  20%|██        | 1/5 [00:12<00:48, 12.17s/it]
--------------------------------------------- Result 2 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

The Race is On: Second Private Team Sets Launch Date for Human Spaceflight (SPACE.com) SPACE.com - TORONTO, Canada -- A second\team of rocketeers competing for the  #36;10 million Ansari X Prize, a contest for\privately funded suborbital space flight, has officially announced the first\launch date for its manned rocket.




[Succeeded / Failed / Skipped / Total] 1 / 2 / 0 / 3:  20%|██        | 1/5 [00:18<01:14, 18.64s/it]
--------------------------------------------- Result 3 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

Ky. Company Wins Grant to Study Peptides (AP) AP - A company founded by a chemistry researcher at the University of Louisville won a grant to develop a method of producing better peptides, which are short chains of amino acids, the building blocks of proteins.




[Succeeded / Failed / Skipped / Total] 1 / 2 / 0 / 3:  40%|████      | 2/5 [00:21<00:32, 10.80s/it]

[Succeeded / Failed / Skipped / Total] 2 / 2 / 0 / 4:  40%|████      | 2/5 [00:21<00:32, 10.81s/it]
--------------------------------------------- Result 4 ---------------------------------------------
Sci/tech (100%) --> Sports (69%)

Prediction Unit Helps Forecast Wildfires (AP) AP - It's barely dawn when Mike Fitzpatrick starts his shift with a blur of colorful maps, figures and endless charts, but already he knows what the day will bring. Lightning will strike in places he expects. Winds will pick up, moist places will dry and flames will roar.

Foresight Driving Helps Expectations Wildfires (AP) AP - It's barely dawn when Meek Fitzpatrick starts his shift with a blur of colorful maps, figures and endless charts, but already he knows what the day will bring. Lightning will strike in places he expects. Winds will pick up, moist places will dry and flames will roar.




[Succeeded / Failed / Skipped / Total] 2 / 3 / 0 / 5:  40%|████      | 2/5 [00:26<00:39, 13.11s/it]
--------------------------------------------- Result 5 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

Calif. Aims to Limit Farm-Related Smog (AP) AP - Southern California's smog-fighting agency went after emissions of the bovine variety Friday, adopting the nation's first rules to reduce air pollution from dairy cow manure.




[Succeeded / Failed / Skipped / Total] 2 / 4 / 0 / 6:  40%|████      | 2/5 [00:55<01:23, 27.95s/it]
--------------------------------------------- Result 6 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

Open Letter Against British Copyright Indoctrination in Schools The British Department for Education and Skills (DfES) recently launched a "Music Manifesto" campaign, with the ostensible intention of educating the next generation of British musicians. Unfortunately, they also teamed up with the music industry (EMI, and various artists) to make this popular. EMI has apparently negotiated their end well, so that children in our schools will now be indoctrinated about the illegality of downloading music.The ignorance and audacity of this got to me a little, so I wrote an open letter to the DfES about it. Unfortunately, it's pedantic, as I suppose you have to be when writing to goverment representatives. But I hope you find it useful, and perhaps feel inspired to do something similar, if or when the same thing has happened in your area.




[Succeeded / Failed / Skipped / Total] 2 / 5 / 0 / 7:  40%|████      | 2/5 [01:15<01:53, 37.90s/it]
--------------------------------------------- Result 7 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

Loosing the War on Terrorism \\"Sven Jaschan, self-confessed author of the Netsky and Sasser viruses, is\responsible for 70 percent of virus infections in 2004, according to a six-month\virus roundup published Wednesday by antivirus company Sophos."\\"The 18-year-old Jaschan was taken into custody in Germany in May by police who\said he had admitted programming both the Netsky and Sasser worms, something\experts at Microsoft confirmed. (A Microsoft antivirus reward program led to the\teenager's arrest.) During the five months preceding Jaschan's capture, there\were at least 25 variants of Netsky and one of the port-scanning network worm\Sasser."\\"Graham Cluley, senior technology consultant at Sophos, said it was staggeri ...\\




[Succeeded / Failed / Skipped / Total] 2 / 6 / 0 / 8:  40%|████      | 2/5 [01:38<02:27, 49.02s/it]
--------------------------------------------- Result 8 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

FOAFKey: FOAF, PGP, Key Distribution, and Bloom Filters \\FOAF/LOAF  and bloom filters have a lot of interesting properties for social\network and whitelist distribution.\\I think we can go one level higher though and include GPG/OpenPGP key\fingerpring distribution in the FOAF file for simple web-of-trust based key\distribution.\\What if we used FOAF and included the PGP key fingerprint(s) for identities?\This could mean a lot.  You include the PGP key fingerprints within the FOAF\file of your direct friends and then include a bloom filter of the PGP key\fingerprints of your entire whitelist (the source FOAF file would of course need\to be encrypted ).\\Your whitelist would be populated from the social network as your client\discovered new identit ...\\




[Succeeded / Failed / Skipped / Total] 2 / 6 / 0 / 8:  60%|██████    | 3/5 [01:39<01:06, 33.04s/it]

[Succeeded / Failed / Skipped / Total] 3 / 6 / 0 / 9:  60%|██████    | 3/5 [01:39<01:06, 33.05s/it]

[Succeeded / Failed / Skipped / Total] 3 / 6 / 1 / 10:  60%|██████    | 3/5 [01:39<01:06, 33.06s/it]
--------------------------------------------- Result 9 ---------------------------------------------
Sci/tech (73%) --> World (69%)

E-mail scam targets police chief Wiltshire Police warns about "phishing" after its fraud squad chief was targeted.

E-mail scam targets police chief Wiltshire Constabulary warns about "phishing" after its fraud squad chief was targeted.


--------------------------------------------- Result 10 ---------------------------------------------
Business (81%) --> [SKIPPED]

Card fraud unit nets 36,000 cards In its first two years, the UK's dedicated card fraud unit, has recovered 36,000 stolen cards and 171 arrests - and estimates it saved 65m.




[Succeeded / Failed / Skipped / Total] 3 / 7 / 1 / 11:  60%|██████    | 3/5 [01:50<01:13, 36.68s/it]
--------------------------------------------- Result 11 ---------------------------------------------
Sci/tech (100%) --> [FAILED]

Group to Propose New High-Speed Wireless Format  LOS ANGELES (Reuters) - A group of technology companies  including Texas Instruments Inc. &lt;TXN.N&gt;, STMicroelectronics  &lt;STM.PA&gt; and Broadcom Corp. &lt;BRCM.O&gt;, on Thursday said they  will propose a new wireless networking standard up to 10 times  the speed of the current generation.




[Succeeded / Failed / Skipped / Total] 3 / 7 / 1 / 11:  80%|████████  | 4/5 [01:59<00:29, 29.79s/it]

[Succeeded / Failed / Skipped / Total] 4 / 7 / 1 / 12:  80%|████████  | 4/5 [01:59<00:29, 29.80s/it]
--------------------------------------------- Result 12 ---------------------------------------------
Sci/tech (99%) --> Business (52%)

Apple Launches Graphics Software, Video Bundle  LOS ANGELES (Reuters) - Apple Computer Inc.&lt;AAPL.O&gt; on  Tuesday began shipping a new program designed to let users  create real-time motion graphics and unveiled a discount  video-editing software bundle featuring its flagship Final Cut  Pro software.

Apple Startup Charting Software, Film Pooling  LOS FRESNO (Msnbc) - Apple Team Inc.&lt;AAPL.s&gt; on  Friday began shipping a new program designed to let users  create real-time motion graphics and unveiled a discount  video-editing software bundle featuring its flagship Conclude Cuts  Careers software.




[Succeeded / Failed / Skipped / Total] 4 / 7 / 1 / 12: 100%|██████████| 5/5 [02:04<00:00, 24.83s/it]

[Succeeded / Failed / Skipped / Total] 5 / 7 / 1 / 13: 100%|██████████| 5/5 [02:04<00:00, 24.83s/it]
--------------------------------------------- Result 13 ---------------------------------------------
Sci/tech (98%) --> Business (72%)

Dutch Retailer Beats Apple to Local Download Market  AMSTERDAM (Reuters) - Free Record Shop, a Dutch music  retail chain, beat Apple Computer Inc. to market on Tuesday  with the launch of a new download service in Europe's latest  battleground for digital song services.

Dutch Retailer Beats Abel to Local Absolution Market  AMSTERDAM (Reuters) - Free Registering Depot, a Dutch music  retail chain, beat Cobbler Typewriters Inc. to market on Tuesday  with the launch of a new download service in Europe's latest  battleground for digital song services.



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 5      |
| Number of failed attacks:     | 7      |
| Number of skipped attacks:    | 1      |
| Original accuracy:            | 92.31% |
| Accuracy under attack:        | 53.85% |
| Attack success rate:          | 41.67% |
| Average perturbed word %:     | 12.45% |
| Average num. words per input: | 59.46  |
| Avg num queries:              | 206.83 |
+-------------------------------+--------+

[17]:
[<textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7f3bf14ceeb0>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3beab4abb0>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3d0d2b26a0>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7f3bca8ca370>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3bfc3de6d0>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3be85a8820>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3bef64c040>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3d0d2abf10>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7f3bec9ef2b0>,
 <textattack.attack_results.skipped_attack_result.SkippedAttackResult at 0x7f3bf14ce490>,
 <textattack.attack_results.failed_attack_result.FailedAttackResult at 0x7f3bf6e73ac0>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7f3bf9a23f10>,
 <textattack.attack_results.successful_attack_result.SuccessfulAttackResult at 0x7f3be944fe50>]
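
The percentages in the summary table can be reproduced from the three counts. The formulas below are inferred from the reported numbers and are not taken from TextAttack's source, so treat them as an assumption that happens to match this run.

```python
succeeded, failed, skipped = 5, 7, 1
total = succeeded + failed + skipped  # 13 samples seen

# Skipped samples are the ones the model already misclassified.
original_accuracy = (total - skipped) / total
# After the attack, only the failed attacks are still classified correctly.
accuracy_under_attack = failed / total
# Success rate is measured over the samples that were actually attacked.
attack_success_rate = succeeded / (succeeded + failed)

print(f"{original_accuracy:.2%}")      # 92.31%
print(f"{accuracy_under_attack:.2%}")  # 53.85%
print(f"{attack_success_rate:.2%}")    # 41.67%
```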

Now let’s visualize our 5 successes in color:

[30]:
import pandas as pd
pd.options.display.max_colwidth = 480 # increase column width so we can actually read the examples

from IPython.display import display, HTML

logger = attacker.attack_log_manager.loggers[0]
successes = logger.df[logger.df["result_type"] == "Successful"]
display(HTML(successes[['original_text', 'perturbed_text']].to_html(escape=False)))
   | original_text | perturbed_text
0  | Fears for T N pension after talks Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent firm Federal Mogul. | Fears for T N pension after talks Unions representing workers at Knapp Newall say they are 'disappointed' after talks with stricken parent firm Federal Titan.
3  | Prediction Unit Helps Forecast Wildfires (AP) AP - It's barely dawn when Mike Fitzpatrick starts his shift with a blur of colorful maps, figures and endless charts, but already he knows what the day will bring. Lightning will strike in places he expects. Winds will pick up, moist places will dry and flames will roar. | Foresight Driving Helps Expectations Wildfires (AP) AP - It's barely dawn when Meek Fitzpatrick starts his shift with a blur of colorful maps, figures and endless charts, but already he knows what the day will bring. Lightning will strike in places he expects. Winds will pick up, moist places will dry and flames will roar.
8  | E-mail scam targets police chief Wiltshire Police warns about "phishing" after its fraud squad chief was targeted. | E-mail scam targets police chief Wiltshire Constabulary warns about "phishing" after its fraud squad chief was targeted.
11 | Apple Launches Graphics Software, Video Bundle LOS ANGELES (Reuters) - Apple Computer Inc.<AAPL.O> on Tuesday began shipping a new program designed to let users create real-time motion graphics and unveiled a discount video-editing software bundle featuring its flagship Final Cut Pro software. | Apple Startup Charting Software, Film Pooling LOS FRESNO (Msnbc) - Apple Team Inc.<AAPL.s> on Friday began shipping a new program designed to let users create real-time motion graphics and unveiled a discount video-editing software bundle featuring its flagship Conclude Cuts Careers software.
12 | Dutch Retailer Beats Apple to Local Download Market AMSTERDAM (Reuters) - Free Record Shop, a Dutch music retail chain, beat Apple Computer Inc. to market on Tuesday with the launch of a new download service in Europe's latest battleground for digital song services. | Dutch Retailer Beats Abel to Local Absolution Market AMSTERDAM (Reuters) - Free Registering Depot, a Dutch music retail chain, beat Cobbler Typewriters Inc. to market on Tuesday with the launch of a new download service in Europe's latest battleground for digital song services.

Conclusion

Our constraint seems to have done its job: it filtered out attacks that did not swap out a named entity for another, according to the NLTK named entity detector. However, we can see some problems inherent in the detector: it often thinks the title of the news article or the first word of a given sentence is a named entity, probably due to capitalization.