Previous

Next


22. Why use Traditional NFA?

  • So, now that you know how the DFA and NFA engines work, you might be wondering why you'd bother to use a Traditional NFA engine? Why not just use DFA, or at least POSIX NFA, and be done with it?
  • There are some trade offs:

  • DFA Advantages:
    1. Always finds the longest leftmost match.
    2. Faster to run - because it's "deterministic", it's only checking each position in the target string once, and working out all the possible matches in parallel, so it doesn't have to go over the string multiple times like an NFA engine.
  • DFA Disadvantages:
    1. Slower to start up, as has to pre-process string to make a memory area to keep track of all those potential matching positions - this can be pretty big for complex regexes.
    2. In general, because a DFA engine doesn't "remember" where the "current" match attempt is up to, because it's doing all match attempts in parallel, it can't do backreferencing! (Okay, some DFA engines can.)


  • Traditional NFA Advantages:
    1. There are lots of engines that use the NFA engine - so they are commonly available.
    2. Backreferencing is not a problem.
    3. You can take advantage of it's problems with greediness and non-greedy alternation to write clever regexes that find things you can't find with a DFA or POSIX NFA engine.
    4. You can also write a regex in a manner to make the regex run faster, based on your knowledge of how it works.
  • Traditional NFA Disadvantages:
    1. May not always find the longest leftmost match.
    2. Slow to run - it may have to scan parts of the target string over and over again as it backtracks through parts of the regex.
    3. Not always sufficiently greedy.


  • POSIX NFA Advantages:
    1. Always finds the longest leftmost match.
    2. Otherwise, much like the Traditional NFA.
  • POSIX NFA Disadvantages:
    1. Can be even slower to run than the Traditional NFA engine.
    2. As with DFA engines, not may provide backreferencing support.

Previous

Next

Andrew Hill

For LinuxSA Meeting, 17 April 2001