30. Metacharacters - The Last Gasp

  • Some tools supply convenient shorthand metacharacters.
  • The most common are:
    \d   Digit                       Generally [0-9]
    \D   Non-Digit                   Generally [^0-9]
    \w   Word Character              Generally [a-zA-Z0-9]; most tools also include "_"
    \W   Non-Word Character          Generally [^\w]
    \s   White Space Character       Generally [ \f\n\r\t\v] (Yes, there's a " " in there!)
    \S   Non-White Space Character   Generally [^\s]
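To see the shorthands in action, here is a minimal sketch using Python's standard `re` module. The sample text is made up for illustration, and the `re.ASCII` flag is a Python-specific switch that restricts `\d`, `\w` and `\s` to the ASCII ranges in the table above; other tools may behave differently.

```python
import re

# Made-up sample text, for illustration only.
text = "Order 66: ship_to\tAdelaide"

# re.ASCII keeps \d, \w and \s to the ASCII ranges listed in the table.
digits = re.findall(r"\d+", text, flags=re.ASCII)
words = re.findall(r"\w+", text, flags=re.ASCII)
non_space = re.findall(r"\S+", text, flags=re.ASCII)

# With re.ASCII, \d finds the same matches as the explicit class [0-9].
same_as_class = re.findall(r"[0-9]+", text) == digits

# Python is one of the tools whose \w includes the underscore.
underscore_is_word = re.fullmatch(r"\w", "_", flags=re.ASCII) is not None

print(digits)     # ['66']
print(words)      # ['Order', '66', 'ship_to', 'Adelaide']
print(non_space)  # ['Order', '66:', 'ship_to', 'Adelaide']
```

Note that `\w+` keeps "ship_to" together because of the underscore, while `\S+` also swallows the colon after "66".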
  • Other tools are POSIX compliant, and often supply POSIX character classes, which are written inside a bracket expression (for example, [[:digit:]]).
  • Some that I think are interesting are:
    [:upper:]   Upper case alphabetic characters
    [:lower:]   Lower case alphabetic characters
    [:alpha:]   All alphabetic characters
    [:alnum:]   All alphabetic characters and digits
    [:digit:]   All digits
    [:blank:]   Space and tab
    [:space:]   All white space
    [.eszet.]   ß (a collating symbol rather than a character class)
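Not every tool understands these names. Python's standard `re` module, for one, has no POSIX character classes, so the sketch below rewrites them as explicit ranges. The table `POSIX_ASCII` and the helper `expand_posix` are hypothetical names invented for this example, and the ranges assume plain ASCII text; a real POSIX tool uses the current locale instead.

```python
import re

# Hypothetical lookup table: ASCII-only stand-ins for some POSIX classes,
# written as the text that goes inside a bracket expression.
POSIX_ASCII = {
    "upper": r"A-Z",
    "lower": r"a-z",
    "alpha": r"A-Za-z",
    "alnum": r"A-Za-z0-9",
    "digit": r"0-9",
    "blank": r" \t",
    "space": r" \t\n\r\f\v",
}

def expand_posix(pattern: str) -> str:
    """Replace each [:name:] in pattern with its ASCII stand-in.

    Raises KeyError for a class name that is not in POSIX_ASCII.
    """
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_ASCII[m.group(1)], pattern)

capitalised = expand_posix("[[:upper:]][[:lower:]]+")
print(capitalised)                                        # [A-Z][a-z]+
print(re.findall(capitalised, "Andrew Hill at LinuxSA"))  # ['Andrew', 'Hill', 'Linux']
```

Collating symbols such as `[.eszet.]` are left out on purpose, since they depend on the locale and have no simple range equivalent.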
  • Whether or not a tool is POSIX compliant also affects how a regex matches, but that's going to have to wait for Part 2.

Andrew Hill

For LinuxSA Meeting, 21 November 2000