15. Metacharacters - Dot

  • Yes, the humble dot just happens to be regex shorthand for "any character". (Well, almost... In some tools the dot will not match a newline and/or a null character - we'll get there soon!)
  • Say we wanted to search some files for a phone number, 08 1234 5678.
  • The only problem is, we're not sure if the phone number has been written as "08 1234 5678", or "08 1234-5678" or even "(08)1234 5678".
  • Fortunately, dot comes to the rescue!
      08.1234.5678
            
    will match any of those phone numbers!
  • Okay, but the same regex will also match something like "08/1234%5678" or "088123405678", which we don't want. So, we could instead have used the regex:
      08[ )]1234[ -]5678
            
    to match any of the three examples above. (In "(08)1234 5678" the match simply starts at the "08", just after the opening bracket.)
  • That second regex isn't very easy to read though, and it's pretty specific - it won't match the phone number if it happened to be written as "08-1234-5678".
  • So, as mentioned before, regexes are a balance between matching what we do want vs. not matching what we don't - and a lot of the time, knowing a little bit about the data we are searching with our regex will help us to decide on how best to "phrase" our regular expression.
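
To see that trade-off in action, here is a minimal sketch that runs both regexes from this slide over the sample strings. It assumes Python's re module, which is only one possible tool (the slide doesn't name one); the names loose, strict, samples and matches are made up for the illustration.

```python
import re

# The two regexes from the slide, compiled with Python's re module
# (an assumption; other regex tools may differ in small details).
loose = re.compile(r"08.1234.5678")         # dot = "any character"
strict = re.compile(r"08[ )]1234[ -]5678")  # only the listed separators

samples = [
    "08 1234 5678",   # wanted
    "08 1234-5678",   # wanted
    "(08)1234 5678",  # wanted
    "08-1234-5678",   # wanted, but not covered by the strict regex
    "08/1234%5678",   # not wanted
    "088123405678",   # not wanted
]


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


for s in samples:
    print(f"{s!r:18} loose={matches(loose, s)} strict={matches(strict, s)}")

# By default, Python's dot does not match a newline character.
print(matches(loose, "08\n1234 5678"))
```

The loose regex accepts every sample, including the two we don't want, while the strict one accepts only the first three. The final line shows the "well, almost" caveat: in this tool, a newline between "08" and "1234" is not matched by the dot.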

Andrew Hill

For LinuxSA Meeting, 21 November 2000