[Israel.pm] A simpler regex required
Yuval Yaari
yuval at windax.com
Wed Aug 15 08:50:32 EEST 2007
Peter Gordon wrote:
> Hi.
>
> Let's suppose that I have the following lines in an HTML file.
> I want to substitute the spaces in the date part with non-breaking spaces (&nbsp;)
>
> <td style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>
> <td style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>
>
> I came up with this line - but somehow it isn't aesthetic.
>
> s!(<td.*?>)(.*?)(</td>)!my $t1 = $1 ;my $t2 = $2 ; my $t3 = $3 ; $t2 =~ s/\s/&nbsp;/g ; "$t1$t2$t3" ;!egs ;
>
> Is there a nicer/cleaner way to write it?
>
Using look-ahead and look-behind, I guess.
Oh, yeah, except that variable-length look-behind won't work in Perl 5.8.8... :-(
I do like this trick in blead, though:
s{<td.*?> \K (.+?) (?=</td>)}
{(my $text = $1) =~ s/\s/&nbsp;/g; $text}egx;
\K is described in perlre as "Keep the stuff left of the \K, don't
include it in $&".
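
For anyone who wants to try it, here is a small self-contained sketch of the two approaches side by side. It assumes Perl 5.10 or later (the first release with \K); the sub names nbsp_cells_keep and nbsp_cells_capture and the sample row are made up for illustration. The second sub avoids \K by capturing the opening tag and putting it back, which should also work on 5.8.

```perl
use 5.010;    # \K needs Perl 5.10 or later
use strict;
use warnings;

# Hypothetical helper: uses \K so that the opening <td ...> tag is matched
# but kept out of the text being replaced.
sub nbsp_cells_keep {
    my ($html) = @_;
    $html =~ s{<td.*?> \K (.+?) (?=</td>)}
              {(my $text = $1) =~ s/\s/&nbsp;/g; $text}egx;
    return $html;
}

# Hypothetical helper: same idea without \K. The opening tag is captured
# and emitted again unchanged; the look-ahead leaves </td> alone.
sub nbsp_cells_capture {
    my ($html) = @_;
    $html =~ s{(<td.*?>) (.+?) (?=</td>)}
              {my ($tag, $text) = ($1, $2); $text =~ s/\s/&nbsp;/g; $tag . $text}egx;
    return $html;
}

# Sample input taken from the question above.
my $row = '<td style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>';

print nbsp_cells_keep($row),    "\n";
print nbsp_cells_capture($row), "\n";
```

Both calls should leave the spaces inside the tag's attributes untouched and only change the spaces in the cell text. Neither version uses /s, so a cell whose contents span several lines would not match.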
HTH (guess not :-)),
~Y