16. Greediness and backtracking
- Okay, now you know how the NFA engine works by trying the regex at the start of
the target string, and bumping along the target as it needs to, and backtracking
through the regex when it reaches a dead end.
- However, you will need to be careful with backtracking and greediness - don't get
caught out thinking that backtracking is going to happen in places where it won't.
Let's think about this example. We want to search this string:
Some 76294 number
with this regex:
([0-9]+)
- What do we get in $1?
Position in the regex | Position in the string
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ |^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ | ^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ | ^
----------------------------------------------
etc.
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ | ^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ ^ | ^^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ ^ | ^^^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ ^ | ^^^^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ ^ | ^^^^^
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ ^ | ^^^^^^
----------------------------------------------
- Whoops! That's not right, so the NFA engine
backtracks to give our final result of:
----------------------------------------------
([0-9]+) | Some 76294 number
^^^ ^ | ^^^^^
----------------------------------------------
with "76294" in $1, just like we'd expect.
- So what happens if we searched with
([0-9]*)
instead?
|