03. The earliest match wins

  • A simplistic view is that when trying to match, a regex engine starts at the first position in the string, and tries to match the regex.
  • If it matches, great! It's done right there. If it doesn't match, then it moves along one character, and tries the regex starting from the next position. This is sometimes called "bumping along".
  • (While we are considering this somewhat simplified view, it may be of interest to mention that if, partway through an attempt, the regex engine reaches a point where the regex no longer matches, it can backtrack to an earlier point in that attempt and try again from there - but we will go over exactly how this happens a little later. For now, take my word for it.)
  • So, for example, when trying to match the regex:
      fanta
    
    On the string:
      This isn't fantastic, let's drink fanta!
    
    The regex will match:
      This isn't fantastic, let's drink fanta!
                 ^^^^^
    
  • So, the earliest match has won - it's the "fanta" in "fantastic" that it matches, and not the standalone word "fanta" later in the string.
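
The simplified "bump along" view above can be sketched in Python. This is only an illustration of the idea, not how any real engine is implemented: the helper name bump_along is made up for this example, and it leans on Python's re module to do each anchored attempt.

```python
import re

text = "This isn't fantastic, let's drink fanta!"
pattern = re.compile(r"fanta")


def bump_along(pattern, text):
    """Hypothetical helper: try the regex at each start position in turn."""
    for start in range(len(text) + 1):
        # Pattern.match with a start position only succeeds if the regex
        # matches beginning exactly at that position.
        attempt = pattern.match(text, start)
        if attempt:
            return attempt  # earliest start position wins; stop here
    return None  # bumped along past the end without a match


manual = bump_along(pattern, text)
builtin = pattern.search(text)  # the engine's own left-to-right search

# Both report the "fanta" inside "fantastic", not the later standalone word.
print(manual.span(), builtin.span())
print(text[manual.start():manual.start() + len("fantastic")])
```

Both calls report the same span, which starts inside "fantastic"; the standalone "fanta" further along is never reached, because the search stops at the first start position that succeeds.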

Andrew Hill

For LinuxSA Meeting, 17 April 2001