reg exp

From: Ken Chesak (datavector_at_hotmail.com)
Date: 08/30/04


Date: 30 Aug 2004 08:59:11 -0700

A Perl script is formatting text for an HTML page. It changes characters
such as & to &amp;, but it should not change &nbsp;. It uses \ as an
escape character, so \&nbsp; becomes &nbsp;. The final results are
correct, but is there a better way to do this?

Input file test.txt
 \HOME & \&nbsp; BORN \& FREE BORN FREE ' \' HELP " \" w\\\\\\\w

1st change
1a= \HOME &amp; \&nbsp; BORN \& FREE BORN FREE '' \' HELP &quot; \" w\\\\\\\w
2nd change
1b= HOME &amp; &nbsp; BORN & FREE BORN FREE '' ' HELP &quot; " w\\\w

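#!/usr/local/bin/perl5
use strict;
use warnings;

# Characters to encode when they are not preceded by a backslash.
my %encode = ( '&'  => '&amp;',
               '"'  => '&quot;',
               '\'' => '\'\'' );

# Read the whole input file into one string.
open my $fh, '<', 'test.txt' or die "Cannot open test.txt: $!";
my $data = do { local $/; <$fh> };
close $fh;
print "Oa= $data\n";
# 1st change: encode every unescaped character that has an entry in %encode.
$data =~ s/(?<!\\)(.)/exists $encode{$1} ? $encode{$1} : $1/eg;
print "1a= $data\n";
# 2nd change: drop each escaping backslash and keep the character after it.
$data =~ s/\\(.)/$1/g;
print "1b= $data\n";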

This is perl, v5.8.0 built for PA-RISC2.0 On HP-Unix.
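
One possible simplification is to fold the two substitutions into a single pass. The sketch below is only an illustration, not a tested replacement: it assumes the intended mapping is & to &amp;, " to &quot; and ' to '', and encode_text is a made-up helper name. The alternation consumes a backslash together with the character it escapes before the second branch can see that character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed mapping: & and " become HTML entities, ' is doubled.
my %encode = (
    '&' => '&amp;',
    '"' => '&quot;',
    "'" => "''",
);

# encode_text is a hypothetical helper, not part of the original script.
sub encode_text {
    my ($text) = @_;
    # Branch 1: backslash plus any character, keep only that character.
    # Branch 2: an unescaped special character, replace it from %encode.
    # Branches are tried left to right at each position, so an escaped
    # character is consumed by branch 1 and never reaches branch 2.
    $text =~ s/\\(.)|([&"'])/defined $1 ? $1 : $encode{$2}/ge;
    return $text;
}

# A single-quoted heredoc keeps every backslash literal.
my $sample = <<'END';
\HOME & \&nbsp; BORN \& FREE ' \' HELP " \" w\\\\\\\w
END
chomp $sample;

print encode_text($sample), "\n";
```

One behavioural difference to be aware of: with the two-pass version, a special character that follows an escaped backslash (for example \\&) is left unencoded, because the lookbehind only checks the single preceding character. The single-pass sketch treats \\ as a literal backslash and then encodes the & that follows. Whether that is the desired result depends on how the input is meant to be read.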


