

You can read this regular expression as “zero or one occurrence of b, followed by re, followed by zero or one occurrence of a, followed by d.” The preceding regular expression is the same as this one:

var reBreadReadOrRed = /b{0,1}rea{0,1}d/;

In this regular expression, the question mark has been replaced with curly braces. Inside the curly braces are the numbers 0, which is the minimum number of occurrences, and 1, which is the maximum. This expression reads the same way as the previous one; it’s just represented differently. Both expressions are considered correct.
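The equivalence can be checked with a short sketch. It assumes the first expression is the question-mark form /b?rea?d/ (inferred from the curly-brace version above); the sample words and the ^ and $ anchors are added here only for illustration.

```javascript
// Question-mark form and its curly-brace equivalent.
// The ^ and $ anchors are an addition so each word must match in full.
var reQuestionMark = /^b?rea?d$/;
var reCurlyBraces = /^b{0,1}rea{0,1}d$/;

// Illustrative sample words (not from the original text).
var words = ["bread", "read", "bred", "red", "breaad", "bbread"];

// Record how each expression treats each word.
var results = words.map(function (word) {
    return {
        word: word,
        questionMark: reQuestionMark.test(word),
        curlyBraces: reCurlyBraces.test(word)
    };
});

results.forEach(function (r) {
    console.log(r.word + ": " + r.questionMark + " / " + r.curlyBraces);
});
```

Both expressions should report the same result for every word, accepting the first four and rejecting the last two.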

To illustrate the other quantifiers, suppose you had to create a regular expression to match the strings “bd”, “bad”, “baad”, and “baaad”. The following table illustrates some possible solutions and which words each match.

Regular Expression    Matches
ba?d                  “bd”, “bad”
ba*d                  “bd”, “bad”, “baad”, “baaad”
ba+d                  “bad”, “baad”, “baaad”
ba{0,1}d              “bd”, “bad”
ba{0,}d               “bd”, “bad”, “baad”, “baaad”
ba{1,}d               “bad”, “baad”, “baaad”

As you can see, only two of the six expressions adequately solve the problem: ba*d and ba{0,}d. Notice that these two are exactly equivalent because the asterisk means “0 or more” just as {0,} does. Likewise, the first and fourth expressions are equivalent, and the third and sixth expressions are equivalent.
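As a minimal sketch, the six expressions can be tested against the four target strings. The variable names are illustrative, and the ^ and $ anchors are an assumption added so that each string must match in full.

```javascript
// The six candidate patterns and the four strings that must all match.
var patterns = ["ba?d", "ba*d", "ba+d", "ba{0,1}d", "ba{0,}d", "ba{1,}d"];
var targets = ["bd", "bad", "baad", "baaad"];

// Map each pattern to the list of target strings it matches.
var matchesByPattern = {};
patterns.forEach(function (source) {
    // Anchors added here so the whole string has to match.
    var re = new RegExp("^" + source + "$");
    matchesByPattern[source] = targets.filter(function (s) {
        return re.test(s);
    });
});

// A pattern solves the problem only if it matches every target string.
var solutions = patterns.filter(function (source) {
    return matchesByPattern[source].length === targets.length;
});

console.log(solutions.join(", ")); // ba*d, ba{0,}d
```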

Quantifiers can also be used with character classes, so if you wanted to match the strings “bead”, “baed”, “beed”, “baad”, “bad”, and “bed”, the following regular expression would do so:

var reBeadBaedBeedBaadBedBad = /b[ae]{1,2}d/;

This expression says that the character class [ae] can appear a minimum of one time and a maximum of two times.
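A short sketch of this expression in use follows. The anchored variant, the list of non-matching strings, and the variable names other than the six listed words are assumptions made for illustration.

```javascript
// Anchored copy of the expression so each string must match in full
// (the anchors are an addition, not part of the original expression).
var reAnchored = /^b[ae]{1,2}d$/;

// The six strings named in the text.
var shouldMatch = ["bead", "baed", "beed", "baad", "bad", "bed"];

// Illustrative strings with zero or three class characters.
var shouldNotMatch = ["bd", "baaad", "beaad"];

var allMatch = shouldMatch.every(function (s) {
    return reAnchored.test(s);
});
var noneMatch = shouldNotMatch.every(function (s) {
    return !reAnchored.test(s);
});

console.log(allMatch, noneMatch);
```

Because the character class accepts either letter at each position, any one- or two-letter combination of “a” and “e” between “b” and “d” matches.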

Greedy, reluctant, and possessive quantifiers

The three kinds of regular expression quantifiers are greedy, reluctant, and possessive.

A greedy quantifier starts by looking at the entire string for a match. If no match is found, it eliminates the last character in the string and tries again. If a match is still not found, the last character is again discarded and the process repeats until a match is found or the string is left with no characters. All the quantifiers discussed to this point have been greedy.
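The greedy process described above can be sketched as follows. The sample string, the pattern, and the helper shrinkUntilMatch are hypothetical, chosen only to illustrate the description; the helper models the “drop the last character and retry” idea and is not how a regular expression engine is actually implemented.

```javascript
// Illustrative sample string and greedy pattern (assumptions).
var sample = "abbbaabbbaaabbb1234";
var reGreedy = /.*bbb/;

// The greedy .* takes as much as it can while still letting "bbb" match.
var greedyMatch = sample.match(reGreedy)[0];

// Hypothetical helper modeling the description: test the whole string,
// then discard the last character and try again until a match is found
// or no characters are left.
function shrinkUntilMatch(text, reWhole) {
    var candidate = text;
    while (candidate.length > 0 && !reWhole.test(candidate)) {
        candidate = candidate.slice(0, -1);
    }
    return candidate;
}

var modeled = shrinkUntilMatch(sample, /^.*bbb$/);

console.log(greedyMatch); // abbbaabbbaaabbb
console.log(modeled === greedyMatch); // true
```

For this input, the modeled process and the built-in match arrive at the same longest match.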

A reluctant quantifier starts by looking at the first character in the string for a match. If that character alone isn’t enough, it reads in the next character, forming a string of two characters. If still no match is
