Re: Get ASCII values for PC arrow keys?
From: jallan (jallan_at_smrtytrek.com)
Date: 12/20/03
- Next message: Leor Zolman: "Re: Iterate type?"
- Previous message: Joec: "Iterate type?"
- In reply to: osmium: "Re: Get ASCII values for PC arrow keys?"
- Next in thread: jallan: "Re: Get ASCII values for PC arrow keys?"
Date: 19 Dec 2003 17:10:01 -0800
"osmium" <r124c4u102@comcast.net> wrote in message news:<brqj6a$694oa$1@ID-179017.news.uni-berlin.de>...
> "You" was not a royal you. I meant the guy I was addressing. *He* was the
> one trying to clarify it. He is not entitled to do that. That doesn't
> preclude some standards body from doing so. I would hope that they would
> have the courtesy to change the identification when they change the
> document.
I've never heard of a "royal you".
When asked, those responsible for standards usually do attempt
clarification about the intent of difficult passages.
Even without that, everyone's interpretation of the same standard is
not equal any more than everyone's interpretation of the US
Constitution is equal. (I've seen some very strange interpretations on
issues in the Constitution, as probably you have also.)
> I think I skimmed all your references and I couldn't find what I was talking
> about. For example: ASCII contains provisions for transmitting binary
> data, SO and SI were involved. Shift out and shift in. This bothered me
> because I could never figure out how it was supposed to work.
ASCII is a character set, not a standard. Accordingly it contains no
provisions of any kind. It might have contained characters to indicate
beginning and ending of binary data. But it doesn't.
SO and SI have nothing to do with binary data.
Standards on control codes and standards on graphic characters were
quite reasonably separated in ECMA and the resulting ISO standards as
being totally different in intent. It was envisioned that users could
shift the graphic characters to another set of characters without also
changing the control characters or could change to an alternate set of
control characters without changing the graphic characters.
However, despite ANSI and ISO wishing to charge for their copyrighted
standards, sufficient portions of these standards must appear in the
ISO/IEC registry for character sets for them to receive identifying
escape sequences.
See http://www.itscj.ipsj.or.jp/ISO-IR/ for the registry.
For the standard ASCII control code definitions within that site see
http://www.itscj.ipsj.or.jp/ISO-IR/. This explains SO and SI.
More complete explanations of these and other recommended shift
functions can be found at
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-035.pdf.
Essentially, in a seven-bit environment, SO (shift out) should display
all following characters as though they were the characters displayed
by the same bytes with the eighth bit set in an 8-bit environment. SI
(shift in) turns off SO.
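The shift mechanism can be sketched in a few lines of Python. This is
only an illustration, assuming the ECMA-35 convention that SO (0x0E)
invokes the alternate set and SI (0x0F) returns to the default set, and
assuming a 94-character alternate set (so only 0x21 to 0x7E are
remapped). The function name unshift_7bit is made up for this example.

```python
SO = 0x0E  # Shift Out: following graphic bytes come from the alternate set
SI = 0x0F  # Shift In: return to the default set


def unshift_7bit(data: bytes) -> bytes:
    """Rewrite a 7-bit SO/SI-shifted stream as its 8-bit equivalent.

    Hypothetical helper: graphic bytes seen while shifted out get the
    eighth bit set; the SO and SI control bytes themselves are dropped.
    """
    out = bytearray()
    shifted = False
    for b in data:
        if b == SO:
            shifted = True
        elif b == SI:
            shifted = False
        elif shifted and 0x21 <= b <= 0x7E:
            out.append(b | 0x80)  # same code position, right-hand half
        else:
            out.append(b)  # controls, space, and unshifted bytes pass through
    return bytes(out)
```

For instance, the byte 0x41 between an SO and an SI comes out as 0xC1,
while the same byte outside the shifted region is left alone.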
A complete listing and explanation of other control characters,
ISO/ANSI escape sequences and ISO/ANSI control sequences is found at
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf.
Of course only the seven-bit control characters are part of the ASCII
set. The rest constitute an additional protocol built on top of the
control codes of ASCII and are not part of ASCII itself.
The best general site for links to character set information is
http://www.i18nguy.com/unicode/codepages.html
> What prevents
> an accidental SI in binary data?
SI has nothing to do with binary data.
> In any event, the scheme was not used
> widely, if at all.
The entire ISO 2022 system proved too complex and required more
keeping track of state than felt reasonable. But large parts of it
were actually implemented in console applications on DEC equipment and
on the Commodore Amiga. Parts have been used in implementations of
eastern character sets, especially under Unix-type systems. If you
want you can find "ANSI terminal" emulations on the web and play
around with them.
But Microsoft and Apple mostly ignored these escape sequences and
control sequences. A very small number of these ISO/ANSI control
sequences and escape sequences can be implemented in an MS-DOS console
by loading the ANSI.SYS device driver.
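For anyone who wants to play with these, here is a rough Python sketch
that builds three of the common control sequences described in ECMA-48
(cursor position, select graphic rendition, erase display). The helper
names cursor_to and sgr are invented for this example, and whether a
given console honours the sequences depends on the terminal.

```python
ESC = "\x1b"
CSI = ESC + "["  # 7-bit form of the Control Sequence Introducer


def cursor_to(row: int, col: int) -> str:
    """CUP: move the cursor to a 1-based row and column."""
    return f"{CSI}{row};{col}H"


def sgr(*params: int) -> str:
    """SGR: select graphic rendition (1 = bold, 31 = red, 0 = reset)."""
    return CSI + ";".join(str(p) for p in params) + "m"


ERASE_DISPLAY = CSI + "2J"  # ED with parameter 2: erase the whole display

if __name__ == "__main__":
    # On a terminal that understands the sequences this clears the
    # screen and prints a bold red word near the top left corner.
    print(ERASE_DISPLAY + cursor_to(2, 4) + sgr(1, 31) + "hello" + sgr(0))
```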
Also it became notorious that character sets mostly weren't registered
with ISO/IEC. People didn't bother. In any case PC character sets and
Macintosh character sets and Windows character sets (and many other
micro-computer character sets) broke ISO/ANSI rules about how
character sets were supposed to be constructed and accordingly could
not be properly shifted into and shifted out of.
> So we have MIME et al to do that work. The *point* is
> that the letters SO and SI *have no intrinsic meaning*. They must be
> defined and described. This does not fit on a single sheet of paper, the
> thing most people think is ASCII.
The thing that most people think is ASCII, the thing that is ASCII,
does fit into two charts on the web: one for the control characters
with a short description of use for each and one for the graphic
characters with a default image for each one. You could combine it
into one chart. It might take two or even three sheets of paper
printed at a reasonable size. But it is not very complex. I don't
think it any more complex for control characters than what most people
would expect: a simple explanation for each control character telling
what it should do.
> I do see fragments of the document in
> some of your links, CR without LF so overprinting works, for example. But I
> can't find all of it, just bits and pieces.
There are cross references to other ECMA standards, all available at
http://www.ecma-international.org/publications/
> I really don't know what you have in mind here. You DO have to take the
> paragraph as a whole though, what I wrote is not written well enough so that
> the individual sentences can stand by themselves. Is there some subtlety in
> the word "standards" itself? That the US can not ordain standards? I note
> that you seem to be from Canada, is this a factor?
Of course not. Anyone can ordain a standard. The difficulty is in
getting people to follow that particular standard without deviation.
> > It was one of 33 FIPS withdrawn in 1997 "because they are obsolete, or
> > have not been updated to adopt current voluntary industry standards."
> > See http://www.itl.nist.gov/fipspubs/33fipwd.htm
>
> Please tell me which of the 33 line items identifies the revocation of ASCII
> code. I looked and looked and didn't see it.
The first one.
> Please note that ASCII was *NOT* a voluntary standard, it was crammed down
> the throat of the industry.
Not true. It began in the U.S. under the auspices of the E.I.A.
(Electronic Industries Association) to solve the problem of
communication interchange between various different telecommunication
standards. Individual corporations could still keep their own
standards if they wished (how could one stop them and why would one
want to?), but there was an obvious cost benefit to having one general
standard to which everyone could translate to and from rather than
each corporation having to deal separately with everyone else's
standard.
The X3 Committee under the auspices of the A.S.A. (American Standards
Association) came from this. It developed various standards for the
data processing industry including coding and media standards and so
forth.
ASCII was created by normal standards procedure with votes by
representatives appointed by various companies and other interested
parties.
No company was ever *forced* to use ASCII (except by market-place
pressure) any more than any company is *forced* to use or market
ANSI C rather than some other proprietary variety of C or any other
language. ASCII was no more or less *forced* on the industry than was
ANSI C or POSIX.
The mess of various substandard and hacked-up ASCII sets that appeared
in practice are enough to show lack of any policing. Commodore had
their own ASCII extensions added to 1963 ASCII and kept to them,
ignoring all changes in later official ASCII. Less than ten years ago
Xerox was still providing their high-speed printers with default fonts
in which some of the symbols did not match official ASCII.
The companies called these things ASCII.
Differences today between systems on the meaning of carriage return
and linefeed and on what character to use for end-of-file marker come
from various companies going their own way despite standards. In their
printer controls Epson used SO as a control code meaning double-width
printing and SI as a control code meaning compressed printing!
The C language used characters in non-invariant positions of ASCII.
(C-coding in most non-US national variants of ASCII was extremely ugly
as letters with diacritics or other symbols appeared in place of
square brackets and braces and in place of some of the operators.)
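To give an idea of how ugly, here is a small Python sketch that shows
C source the way a terminal using the German national variant (DIN
66003) would have displayed it. The table lists the positions where
that variant differs from US-ASCII as I understand it; the function
name is made up for the illustration.

```python
# Positions where the German variant of ISO 646 (DIN 66003) differs
# from US-ASCII: the same byte values display as these characters.
DIN_66003 = str.maketrans({
    "@": "§", "[": "Ä", "\\": "Ö", "]": "Ü",
    "{": "ä", "|": "ö", "}": "ü", "~": "ß",
})


def as_seen_on_german_terminal(source: str) -> str:
    """Show US-ASCII text as the same bytes would display under DIN 66003."""
    return source.translate(DIN_66003)


if __name__ == "__main__":
    print(as_seen_on_german_terminal("int main() { x[0] = ~y; }"))
    # prints: int main() ä xÄ0Ü = ßy; ü
```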
The dream of one set of control characters to be used in the same
fashion on *every* device (that could support them) didn't happen. So
instead we now have Richtext and PDF and TeX and SGML and
HTML/XML/XHTML.
Even graphic characters were not the same on every system and on every
device (without considering the various official national variants of
ASCII). So we now have Unicode which is much simpler than shifting in
and out of many different character sets. Searching and cutting and
pasting becomes far easier when one doesn't have to always search back
in case some kind of shift of character set from the default set has
occurred.
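The searching problem is easy to demonstrate with ISO-2022-JP, one of
the ISO 2022 based encodings still in use for Japanese. A rough Python
sketch, with helper names invented for the example: a byte-level search
that ignores the shift state reports a Latin letter that is not in the
text at all, because a byte inside the shifted region happens to have
the same value.

```python
TEXT = "日本"                        # two kanji, no Latin letters
ENCODED = TEXT.encode("iso2022_jp")  # escape sequence, 4 data bytes, escape back


def naive_contains(haystack: bytes, needle: str) -> bool:
    """Byte search that ignores any character-set shift in the stream."""
    return needle.encode("ascii") in haystack


def decoded_contains(haystack: bytes, needle: str) -> bool:
    """Decode first, so the shift state is tracked, then search characters."""
    return needle in haystack.decode("iso2022_jp")


if __name__ == "__main__":
    print(ENCODED)
    print(naive_contains(ENCODED, "K"))    # a false hit inside the shifted data
    print(decoded_contains(ENCODED, "K"))  # the text really has no "K"
```

With Unicode text the second kind of search is the only kind there is,
which is the simplification being described.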
> IBM said "go to hell" and the government of the
> United States blinked.
IBM was designing System/360 which first introduced 8-bit
architecture. An 8-bit character set was an obvious enhancement to
include. But the ASCII discussions were dragging on and on and on.
Accordingly IBM quite reasonably developed their own encoding. They
intentionally used the same characters that the X3.2 committee had
decided on. But the X3.2 subcommittee changed their minds about some
characters so there were some differences in the end.
However IBM retained representatives on the US committees and on the
later international ISO/TC97/SC2 body which essentially took over the
later development of ASCII from the US.
Honeywell also used their own encoding. Probably others did also
though their sets are now forgotten. ASCII, once introduced, was
obviously the way of the future. Why re-invent the wheel?
Instead ASCII was modified and extended in various implementations.
Then that also cried out for standardization with various 8-bit
encodings and multi-byte encodings.
The standard Microsoft Windows character set is an expansion of
ISO/ECMA Latin-1 which itself conflicts with the US definition of
ASCII in that the most recent US standard defining 7-bit US-ASCII
(ANSI X3.4-1986) still defines 0x27 as "apostrophe (closing single
quotation mark; acute accent)" and 0x60 is Opening Single Quotation
Mark, with Grave Accent only a secondary meaning.
But almost all fonts and extended character sets now follow the ISO
646 standard where 0x27 is a straight typewriter apostrophe (with
acute accent being a separate character among the extended characters)
and 0x60 is a grave accent. MacRoman does this also.
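Unicode follows the ISO reading of these positions, which can be
checked from Python's standard unicodedata module. This is only a
quick check; the describe helper is invented for the example, and it
decodes each byte as Latin-1 to find the corresponding character.

```python
import unicodedata


def describe(byte_value: int) -> str:
    """Return the Unicode character name for a single Latin-1 byte."""
    ch = bytes([byte_value]).decode("latin-1")
    return unicodedata.name(ch)


if __name__ == "__main__":
    for value in (0x27, 0x60, 0xB4):
        print(f"0x{value:02X}  {describe(value)}")
    # 0x27  APOSTROPHE
    # 0x60  GRAVE ACCENT
    # 0xB4  ACUTE ACCENT
```

The directional single quotation marks live elsewhere in Unicode, at
U+2018 and U+2019, rather than at 0x60 and 0x27.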
Unix systems held out for a long time, but newer X-Windows fonts have
finally made the switch to following the ISO 646 and ISO/IEC 8859
standards properly instead of mixing the older ASCII use and the ISO
usage. (Some Unix/Linux users are outraged.)
Jim Allan