Character classes

Character classes are groups of characters to test for. By enclosing characters inside square brackets, you are effectively telling the regular expression to match the first character, the second character, the third character, and so on. For example, to match any one of the characters a, b, or c, the character class is [abc]. This is called a simple class, because it specifies the exact characters to look for.
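A quick way to see a simple class at work is to test individual strings against it. The following is a minimal sketch; the variable names (reAbc and the b-prefixed results) are illustrative and do not come from the original text:

```javascript
// Illustrative name: a simple class that matches exactly one of a, b, or c.
var reAbc = /[abc]/;

var bMatchesB = reAbc.test("b");       // true: "b" is listed in the class
var bMatchesD = reAbc.test("d");       // false: "d" is not listed
var bMatchesInside = reAbc.test("xcx"); // true: the "c" in the middle satisfies the class
```

Note that the class stands for a single character position, no matter how many characters are listed inside the brackets.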
Simple classes
Suppose you want to match “bat”, “cat”, and “fat”. It is very easy to use a simple character class for this purpose:
var sToMatch = "a bat, a Cat, a fAt baT, a faT cat";
// g finds every occurrence; i ignores case
var reBatCatRat = /[bcf]at/gi;
var arrMatches = sToMatch.match(reBatCatRat);
The arrMatches array is now filled with these values: “bat”, “Cat”, “fAt”, “baT”, “faT”, and “cat”.
You can also include special characters inside simple classes (and any other type of character class as well). Suppose you replace the b character with its Unicode equivalent:
var sToMatch = "a bat, a Cat, a fAt baT, a faT cat";
var reBatCatRat = /[\u0062cf]at/gi;
var arrMatches = sToMatch.match(reBatCatRat);
This code behaves the same as it did in the previous example.
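To check that claim, the two patterns can be run side by side and their results compared. This is only a sketch; arrLiteral, arrEscaped, and bSame are names introduced here for illustration:

```javascript
var sToMatch = "a bat, a Cat, a fAt baT, a faT cat";

// Same class written two ways: a literal b and its Unicode escape \u0062.
var arrLiteral = sToMatch.match(/[bcf]at/gi);
var arrEscaped = sToMatch.match(/[\u0062cf]at/gi);

// Joining with a separator gives a simple element-by-element comparison.
var bSame = arrLiteral.join("|") === arrEscaped.join("|"); // true
```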
Negation classes
At times you may want to match all characters except for a select few. In this case, you can use a negation class, which specifies characters to exclude. For example, to match all characters except a and b, the character class is [^ab]. The caret (^) tells the regular expression that the character must not match any of the characters that follow.
Going back to the previous example, what if you wanted to get only the words containing at but not beginning with b or c?
var sToMatch = "a bat, a Cat, a fAt baT, a faT cat";
var reBatCatRat = /[^bc]at/gi;
var arrMatches = sToMatch.match(reBatCatRat);
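In this case, arrMatches contains “fAt” and “faT”, because these strings match the pattern of a sequence ending with at but not beginning with b or c. Because the i flag is set, the exclusion applies to uppercase B and C as well, which is why “Cat” and “baT” are left out.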
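The sketch below contrasts the same negation class with and without the i flag, and also shows that a negation class still has to consume one character. The variable names are illustrative and are not part of the original example:

```javascript
var sToMatch = "a bat, a Cat, a fAt baT, a faT cat";

// With the i flag, the negated class rejects b, B, c, and C.
var arrIgnoreCase = sToMatch.match(/[^bc]at/gi);   // ["fAt", "faT"]

// Without the i flag, only lowercase b and c are excluded, and "at"
// itself must be lowercase, so the single match is "Cat".
var arrCaseSensitive = sToMatch.match(/[^bc]at/g); // ["Cat"]

// A negated class must still match one character, so a bare "at"
// with nothing in front of it does not match.
var bBareAt = /[^bc]at/.test("at");                // false
```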
Range classes
Up until this point, the character classes required you to type all the characters to include or exclude. Suppose that you want to match any alphabetic character, but you really don’t want to type every letter in the alphabet. Instead, you can use a range class to specify a range between a and z: [a-z]. The key here is the dash (-), which should be read as through instead of minus (so the class is read as a through z, not a minus z).
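As a brief illustration, the range class can be tested against a few single characters. This is a minimal sketch with illustrative variable names; note that, as written, [a-z] covers only the lowercase letters a through z unless the i flag is added:

```javascript
// Illustrative name: a range class for the lowercase letters a through z.
var reLower = /[a-z]/;

var bLower = reLower.test("q");      // true: q falls between a and z
var bUpper = reLower.test("Q");      // false: uppercase is outside the range
var bDigit = reLower.test("7");      // false: digits are outside the range

// Adding the i flag makes the same range accept uppercase letters too.
var bUpperIgnoreCase = /[a-z]/i.test("Q"); // true
```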