Escaping Special Characters in PHP Regex
Suppose we want a special character to represent itself. To do this, it must be escaped with a backslash. Let's look at some examples.
Example
In the following example, the regex author wanted the search pattern to look like this: the letter 'a', then a plus '+', then the letter 'x'. However, the code author did not escape the '+' character, so the search pattern actually looks like this: the letter 'a' one or more times, then the letter 'x':
<?php
$str = 'a+x ax aax aaax';
$res = preg_replace('#a+x#', '!', $str);
?>
As a result, the following will be written to the variable:
'a+x ! ! !'
Example
Now the author has escaped the plus with a backslash, so the search pattern looks as it should: the letter 'a', then a plus '+', then the letter 'x':
<?php
$str = 'a+x ax aax aaax';
$res = preg_replace('#a\+x#', '!', $str);
?>
As a result, the following will be written to the variable:
'! ax aax aaax'
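To see exactly which substrings each of the two patterns picks up, the matches can be listed side by side. The following is a minimal sketch that uses preg_match_all, a standard PHP function that the examples above do not use; the variable names $unescaped and $escaped are chosen only for illustration.

```php
<?php
$str = 'a+x ax aax aaax';

// Unescaped plus: one or more letters 'a', then the letter 'x'
preg_match_all('#a+x#', $str, $unescaped);

// Escaped plus: the letter 'a', a literal '+', then the letter 'x'
preg_match_all('#a\+x#', $str, $escaped);

// Index 0 of the result holds the full matches
print_r($unescaped[0]); // 'ax', 'aax', 'aaax'
print_r($escaped[0]);   // 'a+x'
?>
```

The unescaped pattern never matches the substring 'a+x' itself, because after the letter 'a' it expects either another 'a' or an 'x', not a plus.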
Example
In this example, the pattern looks like this: the letter 'a', then a dot '.', then the letter 'x':
<?php
$str = 'a.x abx azx';
$res = preg_replace('#a\.x#', '!', $str);
?>
As a result, the following will be written to the variable:
'! abx azx'
Example
In the next example, the author forgot to escape the dot, and the regex matched all three substrings, because an unescaped dot represents any character:
<?php
$str = 'a.x abx azx';
$res = preg_replace('#a.x#', '!', $str);
?>
As a result, the following will be written to the variable:
'! ! !'
Example
Note that if you forget the backslash before a dot that should represent itself, you might not even notice it:
<?php
preg_replace('#a.x#', '!', 'a.x'); // returns '!', as we wanted
?>
Visually it works correctly (since the dot represents any character, including a regular dot '.'). But if we change the string in which the replacements occur, we will see our mistake:
<?php
preg_replace('#a.x#', '!', 'a.x abx azx'); // returns '! ! !', but '! abx azx' was expected
?>
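When the text to search for is stored in a variable, the backslashes do not have to be added by hand. A possible approach, sketched here under the assumption that the whole search text should be treated literally, is PHP's built-in preg_quote function. The variable names $needle and $pattern are hypothetical and used only for this illustration.

```php
<?php
// Hypothetical literal text that should be found exactly as written
$needle = 'a.x';

// preg_quote() puts a backslash before regex special characters;
// the second argument additionally escapes the delimiter we use ('#')
$pattern = '#' . preg_quote($needle, '#') . '#';

$res = preg_replace($pattern, '!', 'a.x abx azx');
// $pattern is now '#a\.x#', and $res is '! abx azx'
?>
```

This only suits cases where no character of the search text is meant to act as a regex command; for hand-written patterns, the special characters still need to be escaped one by one.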