Regex Metacharacters
Just a quickie today that will set us up for additional Regex Twitchiness in just a bit.
There are certain characters that have special meaning in the syntax of a regular expression. These are: <pre>{}[]()^$.|*+?\</pre>
I'm not going to go over the meaning of each of those now, but I will talk about escaping those within a regular expression. Consider:
<pre>"2+2=5" =~ /2+2/;</pre>This expression doesn't match. In a regex, "+" means "one or more of", so it's looking for one or more 2's followed by a 2. No such match in the given string.
So what do you do when you mean a literal + ? You escape that plus-sign with a backslash:
<pre>"2+2=5" =~ /2\+2/;</pre>THAT expression matches.
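If you want to see the difference for yourself, here's a minimal sketch. The variable names are mine, made up for illustration, and the extra "222" test is just there to show what the unescaped pattern is really hunting for.

```perl
use strict;
use warnings;

my $string = "2+2=5";

# Unescaped: "+" is a quantifier, so this wants "22", "222", and so on.
my $unescaped = ($string =~ /2+2/) ? 1 : 0;

# Escaped: "\+" is a literal plus sign.
my $escaped = ($string =~ /2\+2/) ? 1 : 0;

# The unescaped pattern is happy with a string of consecutive 2's.
my $twos = ("222" =~ /2+2/) ? 1 : 0;

print "unescaped: $unescaped\n";   # prints "unescaped: 0"
print "escaped: $escaped\n";       # prints "escaped: 1"
print "twos: $twos\n";             # prints "twos: 1"
```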
<pre>"The interval is [0,1)." =~ /[0,1)./;</pre>That's a syntax error. ("unmatched [ in regex", to be specific)
<pre>"The interval is [0,1)." =~ /\[0,1\)\./;</pre>That one's perfectly legal AND a match.
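Here's a rough sketch of both cases in one script, assuming the string is meant to contain the half-open interval [0,1). A bad pattern written literally stops the script from compiling at all, so I build the unescaped one from a string and compile it inside an eval. That's my workaround for the demo, not something you'd normally need. The variable names are illustrative.

```perl
use strict;
use warnings;

my $string = "The interval is [0,1).";

# Compile the unescaped pattern at run time so eval can trap the error.
my $unescaped = '[0,1).';
my $compiled  = eval { my $re = qr/$unescaped/; 1 } ? 1 : 0;
my $error     = $@;

# Escaped version: every metacharacter is taken literally.
my $matched = ($string =~ /\[0,1\)\./) ? 1 : 0;

print "unescaped pattern compiled: $compiled\n";   # prints "unescaped pattern compiled: 0"
print "error was: $error" if $error;
print "escaped pattern matched: $matched\n";       # prints "escaped pattern matched: 1"
```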
Sometimes you have \ or / characters in the string you're trying to match, and you have to escape those too (the / because it's the regex delimiter). This can lead to LTS (Leaning Toothpick Syndrome), which is one of the things that can make regexes hard on the eyes.
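To get a feel for the toothpicks, here's a small sketch that matches two file paths. The paths and variable names are made up for the example.

```perl
use strict;
use warnings;

# Single-quoted strings, so the backslashes in the Windows path stay literal.
my $unix_path    = '/usr/local/bin';
my $windows_path = 'C:\temp\logs';

# Every / in the pattern needs a backslash, because / also ends the regex.
my $unix_match = ($unix_path =~ /\/usr\/local\/bin/) ? 1 : 0;

# Every \ in the pattern needs a backslash of its own.
my $windows_match = ($windows_path =~ /C:\\temp\\logs/) ? 1 : 0;

print "unix: $unix_match\n";         # prints "unix: 1"
print "windows: $windows_match\n";   # prints "windows: 1"
```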