Re: how to convert the characters ASCII(0-255) to ASCII(0-127)



On Thu, 29 Dec 2005, Samwyse wrote:

> Alextophi wrote:
> > EXAMPLE:
> >
> > the log "C:\WINDOWS\SchedLgU.Txt", contains wide ASCII characters

There's no such thing. ASCII is definitively a 7-bit character
coding: it has no character positions above 127 (nor any displayable
characters above 126).

There are countless 8-bit character codings which contain the ASCII
characters in their lower half: each one of them that has been
published has a definitive name. You can't make sense of an arbitrary
stream of bytes unless and until you know just which coding you are
dealing with. In this sense, it only spreads confusion to talk about
"8-bit ASCII" or "wide ASCII" or "extended ASCII" as if those terms -
apparently made-up for convenience by somebody who's never been
exposed to the full range of codings - might designate an actual
character coding.

Are you attempting to designate an MS-DOS code page? It seems that
you are. For example, it might be codepage 437, the US national
MS-DOS code page, which is consistent with your presentation; but so
would other code pages be, such as CP850, the "Multilingual (Latin 1)"
DOS code page.

These, and other, MS-DOS code pages are documented at
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/
together with their cross-mappings into Unicode.
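As a rough illustration of why the sample fits more than one code
page, here is a minimal sketch using Perl's Encode module. The byte
values are the two from the question plus 0x9B, which I picked myself
as a position where the two code pages disagree.

```perl
use strict;
use warnings;
use Encode qw(decode);

# 0x83 and 0x8A are the bytes from the question; 0x9B is an extra
# byte chosen here for illustration only.
my @bytes = ("\x83", "\x8A", "\x9B");

my %codepoint;
for my $byte (@bytes) {
    for my $codepage ('cp437', 'cp850') {
        # Decode one byte under the named code page and record the
        # resulting Unicode code point.
        my $char = decode($codepage, $byte);
        $codepoint{$codepage}{$byte} = ord($char);
        printf "0x%02X as %s is U+%04X\n",
            ord($byte), $codepage, ord($char);
    }
}
```

The first two bytes come out the same under both code pages, so the
sample in the question cannot distinguish them; the third does not.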

However, these newsgroup postings are (rightly) in iso-8859-1, which
encodes the accented letters at very different positions: è is 0xE8
there, but 0x8A in CP437 and CP850. So one needs to keep a careful
grasp of which coding is in play at each stage.
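To make the difference concrete, the following sketch (again assuming
Perl's Encode module, and the four candidate codings that seem
plausible here) prints the byte that each coding uses for the two
accented letters in the question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The two letters from the question, given as Unicode code points.
my %letter = (
    'e-grave'      => "\x{E8}",   # è
    'a-circumflex' => "\x{E2}",   # â
);

my %byte_for;
for my $name (sort keys %letter) {
    for my $coding ('cp437', 'cp850', 'iso-8859-1', 'cp1252') {
        # Each of these codings represents the letter as a single byte.
        my $byte = encode($coding, $letter{$name});
        $byte_for{$name}{$coding} = ord($byte);
        printf "%-12s %-10s 0x%02X\n", $name, $coding, ord($byte);
    }
}
```

The two DOS code pages agree with each other, and iso-8859-1 agrees
with Windows-1252, but the two pairs put the letters in different
places.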

> > (ex: "tâche" or "système"),
> >
> > $LINE =~ tr/\x8A/\x65/; # replace ... è > e
> > $LINE =~ tr/\x83/\x61/; # replace ... â > a
> >
> > - how to replace all the ASCII characters?

I read the question as really asking "how to replace all the
*non*-ASCII characters".
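On that reading, one possible approach (a sketch only, not the only
way to do it) is to decode the bytes from the code page, once it has
been identified, and then strip the accents in Unicode rather than
listing byte values one by one. The helper name to_ascii() is made up
for this example, and cp437 is an assumption about the log file, not
an established fact.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# to_ascii() is a hypothetical helper. The code page is a parameter
# because it has to be known, not guessed.
sub to_ascii {
    my ($bytes, $codepage) = @_;
    my $text = decode($codepage, $bytes);  # bytes -> Unicode characters
    $text = NFD($text);                    # è becomes e + combining grave
    $text =~ s/\p{Mn}//g;                  # drop the combining marks
    $text =~ s/[^\x00-\x7F]/?/g;           # anything still non-ASCII -> ?
    return $text;
}

# The two words from the question, as bytes in the assumed code page.
print to_ascii("t\x83che", 'cp437'), "\n";     # tache
print to_ascii("syst\x8Ame", 'cp437'), "\n";   # systeme
```

Characters with no accent-free base letter (box-drawing characters,
for instance) end up as "?" here; whether that is acceptable depends
on what the output is for.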

> Are they wide ASCII, or extended ASCII?

Please, don't do that. We readers of the group have no clear idea
which definitive character codings you are referring to under these
baby-talk names.

It's been my experience that, despite the underlying simplicity of the
topic, character coding is something which causes endless confusion,
which is only made worse by a refusal to call things by their proper
names.

> Your example (and your subject line)
> are talking about extended, not wide, characters.

As I say: of the plausible 8-bit ASCII-based codings (MS-DOS code
pages, or iso-8859-something, or Windows-125x), the evidence points to
an MS-DOS code page. In a Western context, the candidates would be
MS-DOS CP437 or CP850, iso-8859-1, or Windows-1252; of those, only the
two DOS code pages fit the bytes shown.

> http://www.cplusplus.com/doc/papers/ascii.html

Hmmm, this chap also uses baby talk instead of the proper names of
things.

I've no argument with your code fragments, provided that the
questioner has properly identified which MS-DOS code page they are
dealing with; but I do urge you please, in an international forum, to
use terms which make proper sense internationally.

regards