Part II: Core ActionScript 3.0 Data Types

var fakingBoundary:RegExp = /\Wzarflax\W/i;
trace(garbage.match(fakingBoundary)); //null
trace(dialogue.match(fakingBoundary)); // Zarflax!
In the second paragraph of this snippet, you see that not using word boundaries matches the word zarflax even when it is contained within garbage, which in this case is a false positive. Adding word boundaries to the pattern causes the garbage string not to match. In the third paragraph, you test against the intended usage. Both with and without boundaries, the word is found. In the final paragraph of the snippet, you can see why using a zero-width word anchor is different from including the surrounding nonword characters in the pattern. Doing so causes the garbage string not to match because there is no instance of zarflax that has nonword characters before and after it in the garbage string. So far so good: you've eliminated the false positive. But when you attempt to match the pattern in the dialogue string, you end up picking up the nonword characters you required zarflax to be surrounded by as part of the match. The benefit of \b here is that it matches a word boundary without consuming it into the match. Table 12-4 summarizes the anchors available to regular expressions in ActionScript 3.0.
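The difference between asserting a boundary and consuming the surrounding characters is easiest to see with String.replace(). The following is a minimal sketch; the sample string is hypothetical and not part of the original snippet.

```actionscript
// Hypothetical sample string, used only for illustration.
var line:String = "Say hello, world!";

// \b asserts a boundary without consuming it, so only the word is replaced.
trace(line.replace(/\bhello\b/, "goodbye")); // Say goodbye, world!

// \W consumes the space and the comma, so they are replaced along with the word.
trace(line.replace(/\Whello\W/, "goodbye")); // Saygoodbye world!

// \W needs an actual character to match, so it fails at the edges of a string.
trace(/\Whello\W/.test("hello")); // false
trace(/\bhello\b/.test("hello")); // true
```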
TABLE 12-4
Anchors and Boundaries

Anchor   Meaning
^        Beginning of string, or beginning of line when multiline flag is set
$        End of string, or end of line when multiline flag is set
\b       Word boundary; between a word character and a nonword character, or between a word character and the start or end of the string
\B       Not word boundary; between two word characters or between two nonword characters
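As a rough illustration of these anchors, the sketch below applies each one to short strings. Both sample strings are hypothetical and chosen only to make the effect of the multiline flag and of \B visible.

```actionscript
// Hypothetical two-line sample string.
var poem:String = "roses are red\nviolets are blue";

// Without the m flag, ^ matches only at the start of the whole string.
trace(poem.match(/^\w+/g)); // roses

// With the m flag, ^ and $ also match at the start and end of each line.
trace(poem.match(/^\w+/gm)); // roses,violets
trace(poem.match(/\w+$/gm)); // red,blue

// \B matches only where there is no word boundary, so the standalone
// "cat" at the end of the string is left untouched.
trace("catalog cat".replace(/cat\B/g, "CAT")); // CATalog cat
```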
Using a pipe (|) character in a regular expression allows you to match multiple alternatives. You saw this at play earlier in the expression:
var friends:RegExp = /leigh|mariko|neal|oskar|paula/gi;
You can use one pipe to allow two options, or multiple pipes to allow many options. This example matches the name of any of these friends. Alternation can be even more useful when you can specify parts in which alternates are allowed instead of alternating the entire expression. You do this with groups.
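To make this concrete, here is a small sketch that runs the friends expression against a hypothetical guest list, then shows why alternating the entire expression can be too coarse. The strings are invented for illustration.

```actionscript
var friends:RegExp = /leigh|mariko|neal|oskar|paula/gi;

// Hypothetical guest list; Roger is not one of the alternatives.
var guests:String = "Neal, Roger, PAULA, and Mariko";
trace(guests.match(friends)); // Neal,PAULA,Mariko

// Without a group, the pipe splits the whole pattern into "gra" or "ey".
trace("grey".replace(/gra|ey/, "*")); // gr*

// Inside a group, only the one letter alternates.
trace("grey".replace(/gr(a|e)y/, "*")); // *
```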
12: Regular Expressions
Grouping can be used for several purposes. You've already seen that it can capture information out of a specific context, using the pattern at large to match the entire format and the group to make the text you're really interested in available after the match. Use parentheses around part of the pattern to make a group out of it. This kind of group is a capturing group because it is captured for later use.
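A capturing group can be sketched with RegExp.exec(), which returns the whole match at index 0 and each captured group at the following indices. The sample string and the variable names here are assumptions made for illustration.

```actionscript
// Hypothetical sample string.
var header:String = "Chapter 12: Regular Expressions";

// The overall pattern matches the format; the two groups capture the parts of interest.
var chapterPattern:RegExp = /Chapter (\d+): (.+)/;
var found:Object = chapterPattern.exec(header);

trace(found[0]); // Chapter 12: Regular Expressions
trace(found[1]); // 12
trace(found[2]); // Regular Expressions
```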
If you want to include parentheses in your expression, to match actual parentheses in the input text, you must escape the parentheses. Use \( to match an open parenthesis and \) to match a close parenthesis.
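The sketch below mixes both kinds of parentheses in one pattern: the escaped pair matches literal parentheses in the text, while the unescaped pair forms a capturing group. The sample string is hypothetical.

```actionscript
// Hypothetical sample string containing literal parentheses.
var call:String = "Use trace(message) to debug.";

// \( and \) match the literal parentheses; (\w+) captures what is between them.
var argPattern:RegExp = /trace\((\w+)\)/;
var found:Object = argPattern.exec(call);

trace(found[0]); // trace(message)
trace(found[1]); // message
```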
You can also use groups as the container for alternates or repetition:
var rhymes:RegExp = /\b(tr|r|sp|b)a(c|s)e\b/gi;
var str:String = "trace() the race to the orbiting base in outerspace";
trace(str.match(rhymes)); //trace,race,base
This example goes further than a character class could. The (c|s) alternate could be replaced by [cs], but the beginning of a rhyming word can be one or two letters when you use alternates like this. In fact, an alternate can be any pattern, not just these simple letter sequences. The two groups in this expression act as a scope for the alternates. If any of the alternates in the first group are fulfilled, the string and regular expression both advance to the next character. So you can look at the group as a subpattern, a single special kind of match that is self-contained. Using groups for scoping purposes also allows you to apply quantifiers to subpatterns, as in the following example:
