[Israel.pm] run many regexes

Gaal Yahas gaal at forum2.org
Sat May 3 09:58:16 EEST 2008


On Fri, May 2, 2008 at 5:01 PM, Pinkhas Nisanov <pinkhas at nisanov.com> wrote:
> On Thu, May 1, 2008 at 6:48 PM, Gaal Yahas <gaal at forum2.org> wrote:
>  > Do you need sophisticated captures, or is there always the same amount
>  >  of positionals (or do you just need to know whether you matched or
>  >  not)?
>
>  It's simple string ( many strings ) match, I thought use "index" instead of
>  regex, that means to normalize regex strings and search string ( bring all
>  to lower case, remove multi spaces to one space, ... ) and match substring
>  by "index".
>

If you can do that, then yes, it might help. Of course, this is offset
by the need to make a copy of the input, so you'll have to benchmark
it.
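
A minimal sketch of the "index" approach, assuming your patterns are plain literal strings and that lowercasing plus collapsing whitespace is all the normalization you need. The names normalize and first_match are hypothetical, made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase and collapse whitespace runs to one space.
# It works on a copy of its argument, which is the cost mentioned above.
sub normalize {
    my ($s) = @_;
    $s = lc $s;
    $s =~ s/\s+/ /g;
    return $s;
}

# Hypothetical helper: return the first needle found in $input, or undef.
# Needles are assumed to be normalized already.
sub first_match {
    my ($input, @needles) = @_;
    my $haystack = normalize($input);
    for my $needle (@needles) {
        return $needle if index($haystack, $needle) >= 0;
    }
    return;
}

# Normalize the search strings once, up front.
my @needles = map { normalize($_) } ("Banana  Split", "bandanna");
my $hit = first_match("I ordered a BANANA   split today", @needles);
print "match: $hit\n" if defined $hit;
```

The core Benchmark module (cmpthese) is one way to compare this against the regex version on your real data.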

>  >
>  >  Make sure you're using 5.10. Alternatively, try Regexp::Optimizer.
>
>  I use 5.8.8, is there difference between 5.8.8 and 5.10?

Yes, the regular expression engine was improved, and trie optimizations
were introduced. To take advantage of them you'll need to build a single
RE out of your many possibilities:

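# Join all the patterns into one alternation and compile it once.
my $any_expression = "(" . (join "|", @expressions) . ")";
my $any_re = qr/$any_expression/;

# One match per input, instead of one match per input per pattern.
for my $input (@inputs) {
  print "match: $input" if $input =~ $any_re;
}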

The point here is that 5.10 is better at optimizing $any_re than
previous perls: if several expressions share the same prefix, you'll
get less backtracking. E.g., if you have "banana" and "bandanna",
internally the match behaves like "ban(?:ana|danna)". This is true
even when only a small number of the various @expressions share
prefixes.
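
Since you said these are simple strings, here is a sketch of building the combined RE from literals. It assumes the strings should match literally (hence quotemeta) and that you only need to know whether something matched (hence the non-capturing group). The sample data is made up:

```perl
use strict;
use warnings;

# Assumption: the patterns are literal strings, so quotemeta escapes any
# regex metacharacters before they are joined.
my @strings = ("banana", "bandanna", "cherry", "a.b");
my $alternation = join "|", map { quotemeta($_) } @strings;

# Non-capturing group, since only match/no-match is needed here.
my $any_re = qr/(?:$alternation)/;

my @inputs  = ("a ripe banana", "red bandanna", "axb", "a.b");
my @matched = grep { $_ =~ $any_re } @inputs;
print "match: $_\n" for @matched;
```

Note that "axb" is not reported, because the dot in "a.b" was escaped and no longer matches any character.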

>
>
>  >  Precompile your regexps with qr// and store the results in a wide
>  >  enough scope that you don't recompute them.
>
>  sure, I precompile my regexes ( qr// ) and store them in array,
>  then map this array on content string.
>
>
>
>
>  thanks
>  Pinkhas Nisanov



-- 
Gaal Yahas <gaal at forum2.org>
http://gaal.livejournal.com/

