Re: how to convert the characters ASCII(0-255) to ASCII(0-127)



Alextophi wrote:
EXAMPLE:

the log "C:\WINDOWS\SchedLgU.Txt", contains wide ASCII characters (ex:
"tâche" or "système"),

$LINE = ~ tr/\x8A/\x65 /; # replaces ... è > e
$LINE = ~ tr/\x83/\x61 /; # replaces ... â > a

- how to replace all the ASCII characters?

Are they wide ASCII, or extended ASCII? Your example and your subject line both deal with extended characters, not wide ones. BTW, your code fragment can be shortened to this (note that the binding operator is =~, with no space):
$LINE =~ tr/\x8A\x83/\x65\x61/;


What you want to do is a lossy transformation, so I doubt that there's any one "right" way to do it. From your example, I'd use this page:
http://www.cplusplus.com/doc/papers/ascii.html
and hand-build a 'tr' that does what you want. \xC0 through \xFF are fairly easy; the fun part is deciding what you want to do with characters such as "copyright" and "registered". If you'll be translating characters into strings ("copyright" into "(C)", and/or HTML entities), then you want a substitution table:


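  my %xlate = (
    "\xA9" => "(C)",   # copyright sign
    "\xAE" => "(R)",   # registered sign
    "\xB1" => "+/-",   # plus-minus sign
    # add more lines as desired
  );
  # Build a character class from the keys; quotemeta keeps any key
  # that would be special inside [...] literal.
  my $from = join('', map { quotemeta } keys %xlate);
  # ...
  $input =~ s/([$from])/$xlate{$1}/g;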
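
For what it's worth, here is a rough sketch of how the hand-built 'tr' and the substitution table might fit together. It assumes the input bytes are Latin-1 (ISO 8859-1), which is where the \xC0 through \xFF range holds the accented letters; the \x8A and \x83 values in the question appear to come from a DOS code page instead, so the ranges would need adjusting for that data. The name to_ascii and the extra table entries are made up for illustration, and the choice of '?' for leftovers is just one option.

```perl
use strict;
use warnings;

# One-to-many replacements. The first three come from the table above;
# the ligature and sharp-s entries are illustrative additions.
my %xlate = (
    "\xA9" => "(C)",   # copyright sign
    "\xAE" => "(R)",   # registered sign
    "\xB1" => "+/-",   # plus-minus sign
    "\xC6" => "AE",    # AE ligature
    "\xE6" => "ae",    # ae ligature
    "\xDF" => "ss",    # sharp s
);
my $from = join('', map { quotemeta } keys %xlate);

# to_ascii is a hypothetical helper name, not part of any module.
sub to_ascii {
    my ($text) = @_;

    # One-to-one letters, assuming Latin-1 input. When the replacement
    # list is shorter than the search list, tr repeats its last
    # character, so each range collapses to a single letter.
    $text =~ tr/\xC0-\xC5/A/;
    $text =~ tr/\xC7/C/;
    $text =~ tr/\xC8-\xCB/E/;
    $text =~ tr/\xCC-\xCF/I/;
    $text =~ tr/\xD1/N/;
    $text =~ tr/\xD2-\xD6\xD8/O/;
    $text =~ tr/\xD9-\xDC/U/;
    $text =~ tr/\xDD/Y/;
    $text =~ tr/\xE0-\xE5/a/;
    $text =~ tr/\xE7/c/;
    $text =~ tr/\xE8-\xEB/e/;
    $text =~ tr/\xEC-\xEF/i/;
    $text =~ tr/\xF1/n/;
    $text =~ tr/\xF2-\xF6\xF8/o/;
    $text =~ tr/\xF9-\xFC/u/;
    $text =~ tr/\xFD\xFF/y/;

    # One-to-many replacements from the table.
    $text =~ s/([$from])/$xlate{$1}/g;

    # Anything still outside 0-127 becomes '?' (the /c flag
    # complements the search list).
    $text =~ tr/\x00-\x7F/?/c;

    return $text;
}

print to_ascii("t\xE2che syst\xE8me \xA9 2005"), "\n";   # prints: tache systeme (C) 2005
```

Doing the 'tr' pass first keeps the regex class small, and the final complemented 'tr' guarantees that nothing above 127 survives, at the cost of losing whatever the tables did not cover.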