22. Why use Traditional NFA?
- So, now that you know how the DFA and NFA engines work, you might be
wondering why you'd bother to use a Traditional NFA engine? Why not
just use DFA, or at least POSIX NFA, and be done with it?
- There are some trade offs:
- DFA Advantages:
- Always finds the longest leftmost match.
- Faster to run - because it's "deterministic", it's only checking each
position in the target string once, and working out all the possible
matches in parallel, so it doesn't have to go over the string multiple
times like an NFA engine.
- DFA Disadvantages:
- Slower to start up, as has to pre-process string to make a memory area
to keep track of all those potential matching positions - this can be
pretty big for complex regexes.
- In general, because a DFA engine doesn't "remember" where the "current"
match attempt is up to, because it's doing all match attempts in parallel,
it can't do backreferencing! (Okay, some DFA engines can.)
- Traditional NFA Advantages:
- There are lots of engines that use the NFA engine - so they are
commonly available.
- Backreferencing is not a problem.
- You can take advantage of it's problems with greediness and
non-greedy alternation to write clever regexes that find things
you can't find with a DFA or POSIX NFA engine.
- You can also write a regex in a manner to make the regex run
faster, based on your knowledge of how it works.
- Traditional NFA Disadvantages:
- May not always find the longest leftmost match.
- Slow to run - it may have to scan parts of the target string over and over
again as it backtracks through parts of the regex.
- Not always sufficiently greedy.
- POSIX NFA Advantages:
- Always finds the longest leftmost match.
- Otherwise, much like the Traditional NFA.
- POSIX NFA Disadvantages:
- Can be even slower to run than the Traditional NFA engine.
- As with DFA engines, not may provide backreferencing support.
|