Re: ASCII value of a character
From: Maarten Wiltink (maarten_at_kittensandcats.net)
Date: 07/25/04
- Next message: Bernhard Mayer: "Re: form in dll"
- Previous message: Jan Kroeze: "Re: ASCII value of a character"
- In reply to: Jan Kroeze: "Re: ASCII value of a character"
Date: Sun, 25 Jul 2004 10:36:57 +0200
"Jan Kroeze" <jcwk@anymail.co.za> wrote in message
news:l9p6g0hd5ia5anu86evo2tcuj37mc7sjid@4ax.com...
> "Ken" <sross3@bigpond.net.au> wrote:
>> This may sound silly, but how can I get the ascii value of a
>> character. In other words I'm looking for the reverse of char().
The reverse of Char() would be Integer(), because Char is a type
and Char() is a cast.
But you're not looking for a cast. You should have used Chr() in
the first place. Jan is right that Ord() is Chr()'s reverse. But
Ord() works on any ordinal type, not just Char. It also turns
Booleans and enumerations into numbers. For those types no function
exists that reverses it, and to go back you are stuck with casts
(which work).
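
For readers who want to see the pairing outside Delphi, here is a small
sketch in Python, whose built-in ord() and chr() form the same kind of
pair. It is an analogue, not Delphi code: Suit is a made-up enumeration,
and calling the type (bool(...), Suit(...)) only stands in for the
Delphi cast.

```python
from enum import IntEnum


class Suit(IntEnum):
    """Hypothetical enumeration, for illustration only."""
    CLUBS = 0
    DIAMONDS = 1


# Character <-> number: a dedicated pair of functions exists.
code = ord("A")   # character to number
back = chr(code)  # number back to character

# Boolean and enumeration -> number works as well ...
flag_number = int(True)
suit_number = int(Suit.DIAMONDS)

# ... but the way back has no dedicated function; you go through the type.
flag_back = bool(flag_number)
suit_back = Suit(suit_number)

print(code, back, flag_number, suit_number, flag_back, suit_back.name)
```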
> Not sure if this helps but ord() returns the _ANSI_ value of a
> character. AFAIK ANSI is an extension of ASCII, so at least the
> basic values will correspond (a-z, A-Z, 0-9, etc.). Anyway, Delphi
> uses ANSI, so why do you want to convert to ASCII?
ASCII is 7-bit ASCII, ISO-646, containing Unicode code points zero
to 127. It _ends_ after that. Characters with ordinality 128-255
are _not_ ASCII. They may be Extended ASCII, but every manufacturer
has his own ideas about what the upper half of Extended ASCII looks
like. Often even several ideas per manufacturer.
What really comes after 7-bit ASCII is ISO-8859-1, containing Unicode's
first 256 code points. Its second half, like ASCII, starts with 32
non-graphic control characters, but these are sufficiently useless in
Windows environments that Microsoft replaced them with other characters
from higher up in the Unicode table for their windows-1252 codepage.
The Euro sign and the infamous smart quotes are in windows-1252, not in
iso-8859-1.
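
The difference is easy to check with any tool that ships both tables.
A minimal sketch in Python (not Delphi), assuming its "cp1252" and
"latin-1" codecs follow the published mappings; the three sample bytes
are my own choice:

```python
import unicodedata

# Euro sign and the two curly double quotes, as windows-1252 bytes.
data = bytes([0x80, 0x93, 0x94])

as_cp1252 = data.decode("cp1252")   # graphic characters from higher up
as_latin1 = data.decode("latin-1")  # C1 control characters U+0080..U+009F

for ch in as_cp1252:
    print(f"cp1252  U+{ord(ch):04X} {unicodedata.name(ch)}")
for ch in as_latin1:
    # Control characters have no Unicode name, so print the category instead.
    print(f"latin-1 U+{ord(ch):04X} category {unicodedata.category(ch)}")

# ISO-8859-1 maps byte n straight to code point n for all 256 values.
latin1_points = [ord(c) for c in bytes(range(256)).decode("latin-1")]
```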
I _think_ that when Windows says "ANSI", they really mean the current
codepage for the SBCS subsystem of Windows, and that's usually (here)
windows-1252. But as close as Greece, Tunisia, or Russia, it's
probably not. And the codepages for those alphabets don't match
iso-8859-1 as closely. Up to and including 127, though, they are all
identical.
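
A rough check of that claim, again as a Python sketch rather than
Delphi. The list of codepages is my own selection of Windows SBCS
tables, not anything Windows reports:

```python
# Assumed sample of single-byte Windows codepages (Central European,
# Cyrillic, Western, Greek, Arabic, Vietnamese).
codepages = ["cp1250", "cp1251", "cp1252", "cp1253", "cp1256", "cp1258"]

lower_half = bytes(range(128))
ascii_text = lower_half.decode("ascii")

# Bytes 0..127 should decode to the same text under every codepage.
same_lower_half = {cp: lower_half.decode(cp) == ascii_text for cp in codepages}

# Bytes 128..255 are where they part ways. Some tables leave a few bytes
# undefined, hence errors="replace".
upper_half = bytes(range(128, 256))
distinct_upper = {upper_half.decode(cp, errors="replace") for cp in codepages}

print(same_lower_half)
print(len(distinct_upper), "different upper halves")
```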
My point? Converting from ANSI to ASCII is simple: just throw out what
is over 127; and it can be useful because people may have different
opinions about what character 254 is. I may claim it's U+00FE LATIN
SMALL LETTER THORN, cousin Nikos may persist in his opinion that it's
U+03CE GREEK SMALL LETTER OMEGA WITH TONOS, Touriya the baker's
daughter may tell you it's U+200F RIGHT-TO-LEFT MARK, and to other
people it may be U+20AB DONG SIGN, U+044E CYRILLIC SMALL LETTER YU,
or U+0163 LATIN SMALL LETTER T WITH CEDILLA... or it may not be a
valid character at all. And we'd _all_ be right. There are things to
be said for either using straight 7-bit ASCII, or skipping the many
incarnations of Extended ASCII and going to wide strings and Unicode.
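
Both halves of that point can be sketched in a few lines of Python (not
Delphi). The codepage names are Python's codec names, and to_ascii() is
a hypothetical helper of my own, not a library function:

```python
import unicodedata

# One byte, several opinions about what it means.
codepages = ["cp1252", "cp1253", "cp1256", "cp1258", "cp1251", "cp1250"]
readings = {cp: bytes([254]).decode(cp) for cp in codepages}

for cp, ch in readings.items():
    print(f"{cp}: U+{ord(ch):04X} {unicodedata.name(ch)}")


def to_ascii(raw: bytes) -> bytes:
    """Throw out every byte over 127, keeping only 7-bit ASCII."""
    return bytes(b for b in raw if b < 128)


# 0xE9 is outside ASCII, so it is dropped whatever the codepage says it is.
stripped = to_ascii(b"caf\xe9")
print(stripped)
```

Dropping bytes loses information, of course; the other route is to
decode with the right codepage once and keep everything as Unicode from
then on.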
Groetjes,
Maarten Wiltink