18. Greediness always favours matching

  • Here is another classic example of where you need to be careful with greediness. Let's say you have some floating point numbers that you want to truncate to something reasonable: say, to two decimal places if the third decimal place is zero, or to three decimal places otherwise.
  • That seems pretty straightforward:
      #!/usr/bin/perl
      $number = "2.3451245678";
      $number =~ s/(\.\d\d[1-9]?)\d*/$1/;
      print "$number\n";
      $number = "5.190653417532";
      $number =~ s/(\.\d\d[1-9]?)\d*/$1/;
      print "$number\n";
    
  • In other words, look for the literal decimal point, then two digits, then an optional "[1-9]", keeping all of that in $1, and finally any further digits, replacing the whole match with what we kept in $1. Seems pretty straightforward, huh?
  • Indeed, the result is what we want:
      2.345
      5.19
    
  • But what happens if we give it a number that is already truncated the way we want? Say "2.562"? Well, the required literal decimal point, digit, digit, and then the optional digit from "[1-9]" all match, the "\d*" has nothing left to match and so settles for matching zero times, and the substitution replaces the number with itself - essentially a no-op.
  • So, it's pretty tempting to rewrite the regex so that it only does the substitution when it's necessary, and leaves already-truncated numbers alone.
  • Something like:
      $number =~ s/(\.\d\d[1-9]?)\d+/$1/;
    
    makes sense, yeah? Let's make the unwanted digits on the end mandatory, so that the substitution only happens when they are actually there.
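
    One way to test that intuition is to run both versions side by side on the same inputs. The following is only a minimal sketch: the helper truncate_number and its $require_trailing flag are hypothetical names introduced here for illustration, not part of the original example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original slides): apply one of the
# two substitutions to a copy of the number and return the result.
sub truncate_number {
    my ($number, $require_trailing) = @_;
    if ($require_trailing) {
        $number =~ s/(\.\d\d[1-9]?)\d+/$1/;   # trailing digits required
    } else {
        $number =~ s/(\.\d\d[1-9]?)\d*/$1/;   # trailing digits optional
    }
    return $number;
}

# Compare the two regexes on the numbers used above.
for my $input ("2.3451245678", "5.190653417532", "2.562") {
    printf "%-16s \\d* gives %-6s \\d+ gives %s\n",
        $input, truncate_number($input, 0), truncate_number($input, 1);
}
```

    Keep an eye on "2.562" in particular. With "\d+" on the end, the optional "[1-9]?" is free to give back the digit it grabbed if that is what it takes for the overall match to succeed.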

Andrew Hill

For LinuxSA Meeting, 17 April 2001