Re: replace chars



Yes, I think there may not be any standard transformation algorithm for this, and the programs that do it apply their own transforms.
So I've finally decided to try to find all the possible chars with tildes, acute or grave accents, umlauts, etc., and replace them using tr///.

I hope I won't have any issues, because the chars are UTF-8.
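
A minimal sketch of the tr/// approach on UTF-8 text, in case it helps. The helper name strip_accents and the character list are only illustrative assumptions, not a complete table. The point is that tr/// should operate on decoded characters, so the source file declares "use utf8" for its literals:

```perl
use strict;
use warnings;
use utf8;   # the accented literals in this source file are UTF-8

# Hypothetical helper: map a few accented letters to ASCII look-alikes.
# The two lists must line up one-to-one (23 characters each here).
sub strip_accents {
    my ($text) = @_;
    (my $ascii = $text) =~ tr/àáâãäåèéêëìíîïòóôõöùúûü/aaaaaaeeeeiiiiooooouuuu/;
    return $ascii;
}

binmode STDOUT, ':encoding(UTF-8)';
print strip_accents("crème brûlée"), "\n";   # prints "creme brulee"
```

Text read from a file would need to be decoded first, for example by opening the file with the '<:encoding(UTF-8)' layer. Otherwise tr/// sees the individual bytes of each multi-byte character and the mapping will not match.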

Thanks.

Octavian

----- Original Message -----
From: "Gunnar Hjalmarsson" <noreply@xxxxxxxxx>
To: <beginners@xxxxxxxx>
Sent: Wednesday, December 26, 2007 7:33 PM
Subject: Re: replace chars


Tom Phoenix wrote:
On Dec 26, 2007 3:05 AM, Octavian Rasnita <orasnita@xxxxxxxxx> wrote:
I want to replace some special characters with their corresponding Western
European chars, for example ă with a, â with a, ș with s, ț with t, î with i
and so on.

I thought that all those characters were included in the Western European character set ISO-8859-1, and if so, your requirement makes no sense. Do you possibly mean corresponding ASCII characters?

Could you please recommend a module that can do this?

You might be able to do what you want with Encode.

http://perldoc.perl.org/Encode.html

Might he? How?
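
As a quick check of what Encode does by itself with such characters, here is a small sketch. It assumes the default CHECK behaviour of encode(), under which characters with no ASCII equivalent are replaced by a substitution character rather than by a base letter:

```perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{e5}" is å, "\x{e4}" is ä, "\x{f6}" is ö
my $swedish = "\x{e5}\x{e4}\x{f6}";

# With the default CHECK value, characters that cannot be
# represented in ASCII become the substitution character "?".
my $ascii = encode('ascii', $swedish);
print "$ascii\n";   # prints "???"
```

So Encode converts between encodings, but it does not map an accented letter to an unaccented one.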

The Swedish alphabet contains three non-ASCII characters: å, ä and ö. To my knowledge, there is no official encoding scheme that converts them to a, a and o respectively. That's natural, since 'å' is a completely different character from 'a', and so on.

Sometimes, in an English context, the special Swedish characters are converted based on how they are pronounced, like this:

å -> ou
ä -> ae
ö -> oe

I believe the OP will need to identify all the characters he would like to see converted, and code the conversion rules himself using the tr/// or s/// operator.
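
A rough sketch of hand-coded rules with s///, which, unlike tr///, can replace one character with several. The hash name %replacement and the sub name transliterate are hypothetical; the three rules are simply the ones listed above:

```perl
use strict;
use warnings;
use utf8;   # the accented literals in this source file are UTF-8

# Illustrative rules only; extend with whatever your data needs.
my %replacement = (
    'å' => 'ou',
    'ä' => 'ae',
    'ö' => 'oe',
);

# Build a character class from the keys. This assumes every key is a
# single character that has no special meaning inside [...].
my $chars = join '', keys %replacement;

sub transliterate {
    my ($text) = @_;
    $text =~ s/([$chars])/$replacement{$1}/g;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print transliterate("smörgåsbord"), "\n";   # prints "smoergousbord"
```

For strictly one-to-one replacements tr/// is enough. The hash plus s/// is only needed once a rule produces more than one character.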

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
