Re: replace chars



On Dec 26, 2007 2:59 PM, Gunnar Hjalmarsson <noreply@xxxxxxxxx> wrote:
[ Please only quote what's necessary to give context. ]
[ Please don't top-post. ]

Octavian Rasnita wrote:
Gunnar Hjalmarsson wrote:
I believe the OP will need to identify all the characters he would like
to see converted, and code the conversion rules himself using the tr///
or s/// operator.

Yes, I think there might not be any standard transformation algorithm for
doing this, and the programs that do it use their own transformations.
So finally I've decided to try finding all the possible chars with
tildes, acute or grave accents, umlauts, etc, and replace using tr//.

I hope I won't have any issues, because the chars are UTF-8.

Well, then you'll probably need to identify the utf8 octet sequences
that correspond to the special characters you want to see transformed.
snip

Perl strings are in UTF-8*, but if you want to specify a character
without using it directly (so the Perl file can still be treated as
ASCII) you use its Unicode code point in a \x{...} escape instead:

my $a_with_macron = "\x{0101}"; #UTF-8 encoding is C4 81

So, knowing the UTF-8 sequences is fairly useless: you write the code
point and let Perl deal with the encoding.

* Well, for sufficiently recent versions of Perl.
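
As a rough sketch of how the tr/// approach could look with code point
escapes (strip_accents is a made-up name, and the character lists are
only a small illustrative subset, not a complete mapping):

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: fold a few accented Latin letters to plain ASCII.
# The lists below are illustrative, not a complete mapping.
sub strip_accents {
    my ($text) = @_;
    # a with grave, acute, circumflex, tilde, diaeresis, macron
    $text =~ tr/\x{E0}\x{E1}\x{E2}\x{E3}\x{E4}\x{0101}/a/;
    # e with grave, acute, circumflex, diaeresis, macron
    $text =~ tr/\x{E8}\x{E9}\x{EA}\x{EB}\x{0113}/e/;
    return $text;
}

# The source file stays pure ASCII; characters are named by code point.
my $input = "caf\x{E9} \x{0101}";
print strip_accents($input), "\n";    # prints "cafe a"

# The UTF-8 octets only show up once the string is encoded for output.
my $octets = encode('UTF-8', "\x{0101}");
print join(' ', map { sprintf('%02X', ord($_)) } split(//, $octets)), "\n";    # prints "C4 81"
```

When the replacement list is shorter than the search list, tr/// reuses
its last character, which is why a single "a" or "e" is enough on the
right-hand side.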