Re: "Escape" in perl



On Fri, 17 Oct 2008 00:18:21 GMT, sln@xxxxxxxxxxxxxxx wrote:

On Thu, 16 Oct 2008 15:21:21 -0700 (PDT), Bill H <bill@xxxxxxxxx> wrote:

I am using the following code to unescape html text that is coming
from flash:

sub unescape
{
my $text = shift;
$text =~ s/%(..)/pack("c",hex($1))/ge;
return($text);
}

for example it will take this text:

%3CFONT%20FACE%3D%22timesnewroman%22%20COLOR%3D%22#000000%22%20SIZE%3D
%2220%22%3E%3CP%20ALIGN%3D%22CENTER%22%3EChapter%20Title%3C%2FP%3E%3C
%2FFONT%3E

and it will convert it to this:

<FONT FACE="timesnewroman" COLOR="#000000" SIZE="20"><P
ALIGN="CENTER">Chapter Title</P></FONT>

What I am trying to figure out is how to go the other way in Perl:
convert the HTML to an escaped format. Any hints, clues, or pointers
would be appreciated.

Bill H

This might be one way:

sub escape
{
my $text = shift;
$text =~ s/([<>= "#])/sprintf("%%%02X", ord($1))/ge;
return($text);
}
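
A quick sketch of one thing to watch in a sub like this: the sprintf width. With a bare %x, a character below 0x10 (a tab, say) comes out as a single hex digit, and an unescape pattern that expects two characters after the % will not read it back correctly. The names escape_unpadded and escape_padded are made up for this comparison and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical names, for comparison only.
sub escape_unpadded {
    my $text = shift;
    # "%x" gives no leading zero, so "\t" (0x09) becomes "%9"
    $text =~ s/([^\w\n])/'%' . uc sprintf("%x", ord($1))/ge;
    return $text;
}

sub escape_padded {
    my $text = shift;
    # "%02X" always gives two upper-case hex digits, so "\t" becomes "%09"
    $text =~ s/([^\w\n])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

my $in = "a\tb";
print escape_unpadded($in), "\n";   # a%9b
print escape_padded($in), "\n";     # a%09b
```

For the character set in the sample text (space, quotes, angle brackets and so on) both forms give the same result, since all of those are 0x20 or above; the difference only shows up with control characters.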


Yes, this seems to work. The script below goes a step further and escapes
every character except word characters (alphanumerics and underscore)
and newline.

sln


####################
# Esc_some_html.pl
####################
use strict;
use warnings;

# Escape everything except word characters (\w) and newline
# --------------------------------------------
my $htmlchrs = '[^\w\n]'; # or '[<>= "#]'

my $str_escaped = '
%3CFONT%20FACE%3D%22timesnewroman%22%20COLOR%3D%22#000000%22%20SIZE%3D
%2220%22%3E%3CP%20ALIGN%3D%22CENTER%22%3EChapter%20Title%3C%2FP%3E%3C
%2FFONT%3E';


my $str_normal = unescape( $str_escaped);

print "$str_normal\n\n";

print escape( $str_normal)."\n";

sub unescape
{
my $text = shift;
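# Match only real hex pairs; "C" (unsigned char) avoids the wrap warning for bytes above 0x7F
$text =~ s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;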
return($text);
}

sub escape
{
my $text = shift;
$text =~ s/($htmlchrs)/sprintf("%%%02X", ord($1))/ge;
return($text);
}

__END__

output:

<FONT FACE="timesnewroman" COLOR="#000000" SIZE=
"20"><P ALIGN="CENTER">Chapter Title</P><
/FONT>


%3CFONT%20FACE%3D%22timesnewroman%22%20COLOR%3D%22%23000000%22%20SIZE%3D
%2220%22%3E%3CP%20ALIGN%3D%22CENTER%22%3EChapter%20Title%3C%2FP%3E%3C
%2FFONT%3E
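
One case the script above does not cover is text containing characters above 0xFF, where ord() yields more than two hex digits. A minimal sketch, assuming the receiving side expects percent-encoded UTF-8 bytes (that is an assumption; the thread does not say what Flash expects), is to encode to bytes before escaping and decode after unescaping. escape_utf8 and unescape_utf8 are hypothetical names; Encode ships with Perl.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: percent-escape the UTF-8 bytes of a character string.
sub escape_utf8 {
    my $text  = shift;
    my $bytes = encode('UTF-8', $text);
    # Explicit ASCII class so bytes 0x80-0xFF are always escaped
    $bytes =~ s/([^A-Za-z0-9_\n])/sprintf("%%%02X", ord($1))/ge;
    return $bytes;
}

# Hypothetical helper: reverse of escape_utf8.
sub unescape_utf8 {
    my $text = shift;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return decode('UTF-8', $text);
}

my $str = "caf\x{e9} \x{263A}";          # e-acute and a character above 0xFF
my $esc = escape_utf8($str);
print "$esc\n";                          # caf%C3%A9%20%E2%98%BA
print "round trip ok\n" if unescape_utf8($esc) eq $str;
```

For plain ASCII input such as the FONT example, this gives the same result as the escape sub in the script.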


