[Israel.pm] run many regexes
Shlomi Fish
shlomif at iglu.org.il
Sun May 4 12:54:53 EEST 2008
On Sunday 04 May 2008, Pinkhas Nisanov wrote:
> On Sat, May 3, 2008 at 9:58 AM, Gaal Yahas <gaal at forum2.org> wrote:
> > Yes, the regular expression engine was improved, and trie optimizations
> > were introduced. To take advantage of them you'll need to build a single
> > RE out of your many possibilities:
> >
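> > # Combine all patterns into one alternation and compile it once.
> > my $any_expression = "(" . join("|", @expressions) . ")";
> > my $any_re = qr/$any_expression/;
> >
> > # Test each input against the single combined regex.
> > for my $input (@inputs) {
> >     print "match: $input" if $input =~ $any_re;
> > }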
> >
> > The point here is that 5.10 is better at optimizing $any_re than
> > previous perls; if several expressions share the same prefix you'll
> > get less backtracking. E.g., if you have "banana" and "bandanna",
> > internally the matching will be for "ban(?:ana|danna)". This is true
> > even when only a small number of the various @expressions share
> > prefixes.
>
> I tried this code with ~700 regexes and it ran 2-3 times faster!!!
> Does any other programming language have this feature?
>
I think most programming languages that support regexes (e.g., Python, C, Java)
require you to compile the regexes first and only then execute them (though
they may sometimes have convenience functions). In Perl 5 the regex match is
part of the syntax of the language, rather than a library call (for
convenience and Huffmanisation).
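
As a rough sketch of the two approaches discussed in this thread (not code
from anyone's actual program): the sample @expressions and @inputs below are
made up, and the helper names matches_any_loop and matches_any_single are
hypothetical. In Perl, qr// is the explicit "compile first" step; the first
helper tries each precompiled pattern in turn, the second uses one combined
alternation as Gaal suggested.

```perl
use strict;
use warnings;

# Hypothetical sample data; real patterns and inputs would come from your program.
my @expressions = ('banana', 'bandanna', 'ap+le');
my @inputs      = ('a ripe banana', 'red bandanna', 'appple pie', 'cherry');

# Approach 1: compile each pattern once with qr// and try them one by one.
my @compiled = map { qr/$_/ } @expressions;

sub matches_any_loop {
    my ($input) = @_;
    for my $re (@compiled) {
        return 1 if $input =~ $re;
    }
    return 0;
}

# Approach 2: join everything into a single alternation and compile that once.
# A non-capturing group is used here because the captured text is not needed.
my $any_re = do {
    my $alternation = join '|', @expressions;
    qr/(?:$alternation)/;
};

sub matches_any_single {
    my ($input) = @_;
    return $input =~ $any_re ? 1 : 0;
}

# Both approaches should agree on every input.
for my $input (@inputs) {
    printf "%-15s loop=%d single=%d\n", $input,
        matches_any_loop($input), matches_any_single($input);
}
```

Whether the single regex is actually faster depends on the patterns and the
perl version, so it is worth timing both on your own data (the core Benchmark
module can do that). If the expressions are literal strings rather than
patterns, passing each one through quotemeta before joining avoids surprises
from metacharacters.
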
Regards,
Shlomi Fish
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
First stop for Perl beginners - http://perl-begin.org/
The bad thing about hardware is that it sometimes works and sometimes doesn't.
The good thing about software is that it's consistent: it always does not
work, and it always does not work in exactly the same way.