(R) character in RegEXP

From: Tsu-na-mi (tsunami_at_zedxinc.com)
Date: 04/29/04


Date: 29 Apr 2004 12:32:12 -0700

Hi,

I am having trouble getting a simple regexp to recognize the
registered trademark symbol (R) when it is read from XML. The XML
uses &#174; for the symbol, and if I print the string after parsing,
it prints correctly. However, the regexp:

$string =~ s/(R)/somethingelse/g;

does not recognize the (R) symbol. NOTE: (R) here stands for the single
character with code 174 (0xAE), not the three characters "(", "R", ")".
I also tried using \x{AE}, which did not work either. The regular TM
symbol doesn't work either, and seems to throw everything into Unicode
mode, screwing up other stuff like the bullet and copyright symbols.

So my question is: if I have XML like:

<P>This is my Widget&#174;</P>

and read it into a string with XML::Parser, how should I address this
character (and any char > 255, if you know)?

For the record, I am using Perl 5.8.3 on Red Hat 9.0. Thanks for any
help anyone can provide.
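
For reference, here is a minimal sketch of the kind of substitution being
asked about. It does not use XML::Parser. Instead it builds stand-in strings
by code point, on the assumption that the parser hands back decoded character
strings (so &#174; arrives as the single character 0xAE). The variable names
and replacement text are made up for illustration. Naming the characters by
code point in the pattern keeps the match independent of the encoding the
script file is saved in. A literal symbol typed into a UTF-8 source file
would need "use utf8" to be read as one character.

```perl
use strict;
use warnings;

# Stand-ins for parsed text (assumption: the parser returns character
# strings, so &#174; becomes the single code point 0xAE).
my $string = "This is my Widget" . chr(0xAE);
my $tm     = "Widget" . chr(0x2122);   # trade mark sign, code point > 255

# Mimic a string that carries Perl's internal UTF-8 flag, as parser
# output typically does; the character content is unchanged.
utf8::upgrade($string);

# Match by code point rather than by a literal symbol in the source.
(my $fixed    = $string) =~ s/\x{AE}/(R)/g;
(my $fixed_tm = $tm)     =~ s/\x{2122}/(TM)/g;

print "$fixed\n";      # prints: This is my Widget(R)
print "$fixed_tm\n";   # prints: Widget(TM)
```

If the untouched strings are printed instead, the output handle will likely
need an encoding layer, for example binmode(STDOUT, ":utf8"), to avoid
"Wide character" warnings for code points above 255.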


