05. Literal Characters

  • Literal characters are exactly that - literal. The letter "a" in a regex will match the letter "a", and nothing else. Similarly, the letter "A" in a regex will match the letter "A", and a "0" will match a "0", and so on.
  • So, we could write a regex that looked like this:
      This is a regular expression
            
  • Yes, a regex made up of just literal characters is pretty boring, and not very powerful! Still, they have their place. You've probably even used a regex like this before (even though you may not have known it) when you did something like this:
      cd /usr/src/linux
      grep -r 'swear word' .
            
    to find all those nasty words in the kernel source code. See? All you did was feed grep a regex made up of just literal characters!
  • However, before we go on, this is a good time to point out that the first regex above (This is a regular expression) doesn't match the words "This is a regular expression". Regular expressions have no concept of the English language, so it is not correct to think of the regex as matching words.
  • Rather, the regex matches a "T", then an "h", then an "i", and so on, one character at a time, spaces included.
  • It's a good idea to start thinking about regexes in this way, because it will force us to look very carefully at the regex, and it will help later on when we talk about the way that regex engines work.
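
The points above can be tried out directly with grep. What follows is a minimal sketch rather than anything from the original talk: the three sample lines and the count_matches helper are made up for illustration. It shows that a literal regex matches only its exact sequence of characters, that "T" and "t" are different literals, and that a regex can match in the middle of what we would call words.

```shell
# Hypothetical sample input: three lines that differ only slightly.
sample='This is a regular expression
this is a regular expression
This is a regex'

# count_matches is an illustrative helper, not a standard command.
# grep -c prints how many input lines contain a match for the regex.
count_matches() {
    printf '%s\n' "$sample" | grep -c -- "$1"
}

# Only the first line contains this exact character sequence.
count_matches 'This is a regular expression'   # prints 1

# Literals are case-sensitive: "t" does not match "T".
count_matches 'this is'                        # prints 1

# No notion of words: "is a reg" starts and stops mid-word,
# yet it matches in all three lines.
count_matches 'is a reg'                       # prints 3
```

The last call is the one worth dwelling on: grep never sees "is", "a" or "reg" as words, only an "i" followed by an "s", a space, an "a", and so on.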

Andrew Hill

For LinuxSA Meeting, 21 November 2000