13. The NFA Engine - An Example
- Consider this regex:
to(morrow|mmorrow|mmy)
Searching this string with tomorrow mis-spelled:
I hope this makes sense tommorrow.
- Let's work through the example - in this case, I've used just the
"^" character to show what is being matched on both sides. As with
before, the regex engine is starting at the start of line position:
Position in the regex | Position in the string
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ |^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
etc.
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ | ^^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
- Notice the engine has backtracked in the regular expression when it
has found that the next character does not match. The engine goes back to the
last place where a partial successful match occurred, and tries other options
from there.
- In this case, it goes back to the place where just the "t" matched, and the
only possible thing it can do from there is to bump along to the next character.
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
etc.
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^ | ^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ | ^^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ ^ | ^^^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ ^^ | ^^^^
------------------------------------------------------------
- Again, when the next character in the string doesn't match, the regex engine
backtracks to the last place in the regex where it was able to match, and
carries on from there.
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ | ^^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ ^ | ^^^
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ ^^ | ^^^^
------------------------------------------------------------
etc.
------------------------------------------------------------
to(morrow|mmorrow|mmy) | I hope this makes sense tommorrow.
^^ ^^^^^^^ | ^^^^^^^^^
------------------------------------------------------------
- Great! The NFA engine also matches!
|