[Israel.pm] run many regexes
Yitzchak Scott-Thoennes
sthoenna at efn.org
Sun May 4 12:56:13 EEST 2008
On Sun, May 4, 2008 2:21 am, Pinkhas Nisanov wrote:
> On Sat, May 3, 2008 at 9:58 AM, Gaal Yahas <gaal at forum2.org> wrote:
>
>> Yes, the regular expression engine was improved, and trie optimizations
>> were introduced. To take advantage of them you'll need to build a single
>> RE out of your many possibilities:
>>
>> my $any_expression = "(" . (join "|", @expressions) . ")";
>> my $any_re = qr/$any_expression/;
>>
>> for my $input (@inputs) {
>>     print "match: $input" if $input =~ $any_re;
>> }
>>
>>
>> The point here is that 5.10 is better at optimizing $any_re than
>> previous perls; if several expressions shared the same prefix you'll get
>> less backtracking. E.g. you have "banana" and "bandanna", internally the
>> matching will be for "ban(?:ana|danna)". This is true even when only a
>> small number of the various @expressions share prefixes.
>>
>
> I tried this code with ~700 regexes and it runs 2-3 times faster!!!
> Does any other programming language have this feature?

On 5.8.8 or 5.10? In 5.10, with the trie optimization, I'd expect much
better than 2-3 times faster, so you may be running into a bug that
prevents the optimization from being applied sometimes.
See http://perlmonks.org/?node_id=670558
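
To check that the combined regex gives the same answers as trying each pattern in turn, something like the following minimal sketch may help. The sample patterns and inputs are made up for illustration (the real ~700 regexes are not shown in this thread), and match_each and match_any are hypothetical helper names. It also assumes the patterns are plain literal strings, hence the quotemeta; drop that if yours are real regexes.

```perl
use strict;
use warnings;

# Hypothetical sample data; substitute your own patterns and inputs.
my @expressions = qw(banana bandanna apple apricot);
my @inputs      = ("a ripe banana", "red bandanna", "cherry pie", "apricot jam");

# Build one alternation out of all the patterns. quotemeta() assumes the
# patterns are literal strings. (?:...) is used instead of (...) because
# nothing here reads the capture.
my $alternation = join "|", map { quotemeta($_) } @expressions;
my $any_re      = qr/(?:$alternation)/;

# Baseline: try every pattern separately, stopping at the first hit.
sub match_each {
    my ($input) = @_;
    for my $expr (@expressions) {
        return 1 if $input =~ /\Q$expr\E/;
    }
    return 0;
}

# Single-regex version: one match against the combined alternation.
sub match_any {
    my ($input) = @_;
    return $input =~ $any_re ? 1 : 0;
}

my @matched = grep { match_any($_) } @inputs;
print "match: $_\n" for @matched;
```

Timing match_each against match_any with the core Benchmark module on both 5.8.8 and 5.10 should show whether the speedup comes from the single alternation alone or from the 5.10 optimization as well.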