28. Metacharacters - Grouping Revisited
- The other thing that grouping with brackets lets us do is called backreferencing.
- Say we wanted to find not just the phone numbers with "8989", but with any set of repeated
numbers. Here's another list of numbers
(example05.txt):
8989 4325
8998 0897
8282 9834
8715 8989
8651 0000
- Using the command:
egrep '([0-9]{2,4})\1' example05.txt
we get:
8989 4325
8282 9834
8715 8989
8651 0000
- Notice how this regex uses the interval {2,4} because we want to search for a set of repeated numbers
(i.e. at least two numbers repeated). We then use the backreference
\1
to ask the regex to match the same characters that it matched in the first group of brackets (again).
- Of course, because of the spaces in the data, the most that will match for the interval will be
two in this case, so the interval {2} would be just as effective in this example.
- \1 is the first group, \2 is the second, and so on.
- Some tools, like Perl, use $1, $2, etc, rather than \1 for backreferencing.
- Finally, remember that some tools might require \( and \) to get the backreferences!
|