Re: Fortran decimal anyone?
From: John H. Lindsay (jlin_DELETE_THIS_SPAM_ZOT_dsay_at_kingston.net)
Date: 05/13/04
- Next message: John Harper: "Re: Is intrinsic MIN inefficient?"
- Previous message: Ken Plotkin: "Re: Migration form frotran powerstation to g77"
- In reply to: David Frank: "Re: Fortran decimal anyone?"
- Next in thread: Mike Cowlishaw: "Re: Fortran decimal anyone?"
- Reply: Mike Cowlishaw: "Re: Fortran decimal anyone?"
- Reply: Dave Thompson: "Re: Fortran decimal anyone?"
Date: Wed, 12 May 2004 20:19:10 -0400
Hi Fortranners:
While looking at the problem of supporting packed and unpacked
decimal data strings (and at the same time the problems in doing
arbitrary precision integer and fixed point decimal and fixed
point binary arithmetic) in SNOBOL4 under OS/2, I made a list of
the internal representations of fixed point decimal data that I
had seen on Intel hardware, some immediate extensions thereof and
alternates thereto. Some of them were inherited from (I.B.M.,
Burroughs or Honeywell) (mainframe hardware, PL/Is or COBOLs). I
think the list gives an understanding of the potentially huge
size of the 'simple' problem of fixed point decimal data in
FORTRAN. Plainly, since such data exists in files which
Fortranners want to, even need to, process, a means to access,
use, and produce such data in files is needed. BTW, I'm speaking
of data layouts where the data is _not_ 'DISPLAY' (COBOL term,
one digit &c. per byte, for printing or for other _visual_
reading), but rather 'COMPUTATIONAL', for calculation either by
hardware or by carefully optimized in-line generated code or
library software (whether visually readable or not).
All the decimal data field formats I have seen were of fixed
length and of fixed (assumed) position within their data records;
no terminating or attached/embedded field length controlling
characters or subfields were seen. Similarly, while I have seen
cases where decimal point indicators (".", or "," in some
countries) existed in fields I would want to call
COMPUTATIONAL, in such cases the decimal points were always in a
fixed position. Where the decimal point indicator could float in
the field, because of the size of the code required to handle the
data, I'd want to call it DISPLAY. I also regard 'BLANK WHEN
ZERO' fields as DISPLAY.
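As a rough illustration of a fixed (assumed) decimal point: the field holds only digits and perhaps a sign, and the record layout says how many of those digits fall after the point. The sketch below uses Python's standard decimal module; the function name and the `scale` parameter are hypothetical, chosen only for this example.

```python
from decimal import Decimal

def apply_implied_point(digits: int, scale: int) -> Decimal:
    """Treat an integer read from a field as having `scale` digits
    after an assumed decimal point that is not stored in the field."""
    # scaleb shifts the exponent without disturbing the digits themselves.
    return Decimal(digits).scaleb(-scale)

print(apply_implied_point(-12345, 2))  # -123.45
```

The position of the point lives in the program (or the record description), never in the data, which is what keeps the handling code small.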
To repeat a bit, I'm dealing here with the _internal_ machine
format of the data as might be seen in a hexadecimal core dump,
and handled in assembler or a very machine-oriented language.
One Decimal Digit per Byte (Unpacked) Data Formats.
--------------------------------------------------
The digits may be arranged within a data field as big-endian or
little-endian (2 options).
The digit characters seen were either ASCII or EBCDIC (printable)
characters or 1-byte binary representations of 0 through 9 - i.e.
0x00 or 00h through 0x09 or 09h as in assembler source (3
options).
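A minimal sketch of reading an unsigned unpacked field under these 2 x 3 options. It relies on the fact that ASCII digits are 0x30 through 0x39 and EBCDIC digits are 0xF0 through 0xF9, so in all three encodings the low half-byte is the digit value. The function and parameter names are hypothetical, and blanks and signs are deliberately not handled here.

```python
# High half-byte ("zone") expected for a digit in each encoding.
ZONES = {"ascii": 0x30, "ebcdic": 0xF0, "binary": 0x00}

def decode_unpacked_unsigned(field: bytes, encoding: str = "ascii",
                             little_endian: bool = False) -> int:
    """Decode a one-digit-per-byte field with no sign and no blanks."""
    zone = ZONES[encoding]
    # For little-endian fields the least significant digit comes first.
    data = field[::-1] if little_endian else field
    value = 0
    for b in data:
        if (b & 0xF0) != zone or (b & 0x0F) > 9:
            raise ValueError(f"byte 0x{b:02X} is not a {encoding} digit")
        value = value * 10 + (b & 0x0F)
    return value

print(decode_unpacked_unsigned(b"0123"))                          # 123
print(decode_unpacked_unsigned(bytes([0xF4, 0xF2]), "ebcdic"))    # 42
print(decode_unpacked_unsigned(bytes([3, 2, 1]), "binary", True)) # 123
```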
In no case seen was it possible for 2 or more sign characters or
indications to occur in any one field.
Some data formats allowed leading blanks for 0's, and some did
not. Some allowed trailing blanks and some did not (4 options,
but see below where leading and/or trailing blanks are accounted
for as they may occur with each of the other possibilities).
I'm treating the cases where a trailing DB, _DB, CR or _CR
(where the '_' represents a blank character) is allowed as a
negative sign as 'DISPLAY', even though some COBOLs and PL/I
allowed doing arithmetic with such things; I haven't seen any
hardware that did that directly. Similarly, I'm treating fields
with comma, decimal point and embedded blank characters (as
thousands, millions, billions, ... separators, but not as an
actual decimal point indicator), whether leading, trailing or
embedded, as DISPLAY, not as COMPUTATIONAL.
No cases of hardware support of biased decimal data were seen,
nor were cases of hardware support of range limits for decimal
data (other than simple digit and sign capacity of the field).
The sign characters seen were either
(1) Never present (data always assumed positive in cases
seen; with leading/trailing blanks 4 options).
(2) Always present (even if positive).
(3) Optional (absent => positive was the only usage seen in
this case).
In some cases, the sign indication, if present, was a separate
character -- a minus sign ('-') or a plus sign ('+') in the
system standard character set were the only ones seen. In
others, the sign indication, if present, was combined in a byte
with a digit indication -- an 'overpunched digit' character.
Typically this was '{' and A to I for +0 to +9, and '}' and J to R
for -0 to -9 (EBCDIC). The concept of 'overpunched blank' was not
seen.
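A sketch of decoding a trailing 'overpunched digit' in an EBCDIC field. It assumes the common zoned-decimal convention in which the high half-byte of the sign-carrying byte is 0xC for positive, 0xD for negative and 0xF for unsigned, while every other byte is a plain EBCDIC digit (0xF0 through 0xF9). That convention is an assumption of this example, not something the list above pins down, and the function name is hypothetical.

```python
def decode_zoned_trailing_sign(field: bytes) -> int:
    """Decode an EBCDIC unpacked field whose last byte carries the sign."""
    if not field:
        raise ValueError("empty field")
    value = 0
    for i, b in enumerate(field):
        zone, digit = b >> 4, b & 0x0F
        is_last = i == len(field) - 1
        # Only the last byte may carry a sign zone; the rest must be 0xF.
        allowed = (0xC, 0xD, 0xF) if is_last else (0xF,)
        if zone not in allowed or digit > 9:
            raise ValueError(f"unexpected byte 0x{b:02X} at offset {i}")
        value = value * 10 + digit
    return -value if (field[-1] >> 4) == 0xD else value

print(decode_zoned_trailing_sign(bytes([0xF1, 0xF2, 0xD3])))  # -123
print(decode_zoned_trailing_sign(bytes([0xF1, 0xF2, 0xC3])))  # 123
```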
In the case of (2) or (3) above, the sign code was either
(a) A leading separate character as the first character of
the field (blanks may or may not occur between the sign
and digits, and may or may not after them - 4 options).
(b) A leading separate character as in (a), but following
any leading blanks and preceding any other digits (same
4 options).
(c) A trailing separate character (same 4 options).
(d) An embedded separate character in a fixed position (cf.
PL/I picture format character J ; same 4 options).
(e) A floating separate character, but always following any
leading blanks and preceding any trailing blanks (same
4 options).
(f) A leading 'overpunched digit' character as the first
character of the field (leading blanks not possible in
this case; with trailing blanks, 2 options).
(g) A leading 'overpunched digit' character (following any
leading blanks; the 4 options).
(h) A trailing 'overpunched digit' character (the 4 options).
(i) An embedded 'overpunched digit' character in a fixed
position (the 4 options).
(j) An 'overpunched digit' where the sign could float to any
digit in the field (the 4 options).
--- (38 options)
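To give a feel for just one of these layouts, here is a sketch of case (b) with an optional sign: an ASCII field with optional leading blanks, then an optional '+' or '-', then the digits, then optional trailing blanks. The function name is hypothetical, and each of the other cases would need its own pattern.

```python
import re

# leading blanks, optional separate sign, digits, trailing blanks
_CASE_B = re.compile(r"^ *([+-]?)([0-9]+) *$")

def parse_leading_sign(field: str) -> int:
    """Parse case (b): sign directly before the digits, after any blanks."""
    m = _CASE_B.match(field)
    if m is None:
        raise ValueError(f"field {field!r} does not fit layout (b)")
    sign, digits = m.groups()
    value = int(digits)
    # An absent sign means positive, the only usage seen for optional signs.
    return -value if sign == "-" else value

print(parse_leading_sign("  -123 "))  # -123
print(parse_leading_sign("0042"))     # 42
```

A field such as "- 12", with a blank between sign and digits, is rejected here; it belongs to case (a) instead.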
No case was seen where a separate sign character could float
among leading or trailing blanks, and no case was seen where a
separate sign character occurred in a 'fixed' position other than
as the first character of the field or immediately before the
digit characters.
For the unpacked data forms, this gives 2 x 3 x (4 + 38 + 38) =
480 cases.
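The count can be checked mechanically. The short tally below simply restates the factors given above; the variable names are only labels for this example.

```python
byte_orders = 2       # big-endian or little-endian digits in the field
digit_encodings = 3   # ASCII, EBCDIC, or 1-byte binary 0..9
unsigned_layouts = 4  # sign never present: leading/trailing blanks or not

# Blank-handling options for each sign placement (a) through (j).
sign_placements = {"a": 4, "b": 4, "c": 4, "d": 4, "e": 4,
                   "f": 2, "g": 4, "h": 4, "i": 4, "j": 4}
per_sign_rule = sum(sign_placements.values())

# The placements apply twice: sign always present, and sign optional.
total = byte_orders * digit_encodings * (unsigned_layouts + 2 * per_sign_rule)
print(per_sign_rule, total)  # 38 480
```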
Two Decimal Digits per Byte (Packed) Data Formats.
-------------------------------------------------
In the cases I've seen, a decimal digit was represented by a hex
digit, and as there are 16 hex digits (0 through F), the other 6
hex digits were used somehow as a sign. I haven't seen any use
of the other 6 hex digits (A through F) as anything other than a
sign.
The digits may be either big endian or little endian within a
byte (2 options).
The bytes may be either big endian or little endian within a
field (2 options).
A sign may be
(i) Never present (data always assumed positive in cases
seen)
(ii) Always present (even if positive).
(iii) Optional (absent => positive was the only usage seen in
this case).
In the case of (ii) and (iii), the sign could be
(I) Leading.
(II) Trailing.
(III) Embedded at a fixed location in the field.
(IV) Floating within the field.
This gives a maximum of 2 x 2 x 3 x (1 + 4 + 4) = 108 cases.
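As one concrete point in this space, here is a sketch of decoding the layout familiar from I.B.M. mainframes: high half-byte first within each byte, most significant byte first, and a sign always present as the last half-byte. The sign codes assumed here (0xC positive, 0xD negative, 0xF unsigned) are the usual ones for that layout but are an assumption of the example, as is the function name.

```python
def decode_packed(field: bytes) -> int:
    """Decode big-endian packed decimal with a trailing sign half-byte."""
    if not field:
        raise ValueError("empty field")
    nibbles = []
    for b in field:
        nibbles.append(b >> 4)    # high half-byte comes first
        nibbles.append(b & 0x0F)
    *digits, sign = nibbles       # the last half-byte is the sign
    if any(d > 9 for d in digits):
        raise ValueError("non-decimal half-byte in digit position")
    if sign not in (0xC, 0xD, 0xF):
        raise ValueError(f"unexpected sign code 0x{sign:X}")
    value = 0
    for d in digits:
        value = value * 10 + d
    return -value if sign == 0xD else value

print(decode_packed(bytes([0x12, 0x3D])))  # -123
print(decode_packed(bytes([0x12, 0x3C])))  # 123
```

Each of the other byte orders and sign positions would need its own variant of the unpacking step, which is where the case count starts to bite.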
Plainly, using this whole scheme of possibilities is not
reasonable, and any one machine or implementation of a language
on a particular piece of hardware that I've seen uses only a
small subset of the possibilities. Choosing a subset suitable
for a particular machine or implementation of a language is no
simple job if one tries to be compatible with a large number of
the data forms existing in files.
Even the attempt to simplify the problem by allowing the
conversion of the above forms to and from a common form for doing
arithmetic, and doing the arithmetic in that form, is big enough
(and probably slower than we Fortranners would like to call
reasonable).
John.
-- John H. Lindsay jlin_DELETE_THIS_SPAM_ZOT_dsay@kingston.net 48 Fairway Hill Crescent, Kingston, Ontario, Canada, K7M 2B4.