Side-effecting input FD logical record hazardous?



Gentlepeople,

I'll start with my conclusion reached from a recent round of debugging,
then recite some of the empirical findings supporting it, and finally
ask for confirmation/response by the group.

Here's the conclusion: Side-effecting (overwriting) any portion of the
logical record in the FD (File Description) area for a variable-length
blocked file is hazardous, leading to unpredictable results such as
S0C7 and S002 Abends. Only the character positions up to the variable
length most recently READ should be considered as "belonging" to the
COBOL program. Characters beyond the length just READ, but still within
the maximum record length in the FD, belong to the Operating System
(COBOL internals or the like); they should be considered "off limits"
and should not be modified.

Here's the scenario. IBM Enterprise COBOL 3.2.1. FD had two logical
records defined, as TRANSACTION-HEADER PIC X(121) and TRANSACTION-FULL
PIC X(8196). There were some "garbage" (semicolon) characters in the
input file I desired to clean up "on the fly", by performing an INSPECT
TRANSACTION-FULL REPLACING ALL ';' BY ',' immediately following the
READ TRANSACTION-FILE. Note that the READ was directly into the FD and
not a READ INTO. Also note that the "scope" of the INSPECT would have
been the full length of the 01-level record (8192 characters), and not
restricted to the actual length of the record just READ. The code did
not include an OCCURS DEPENDING ON, but the two differing lengths of
logical records should have caused the compiler to set up the file
description format as variable record lengths (V), not fixed length
(F).
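
For concreteness, the hazardous pattern described above might look
roughly like the following. This is a minimal sketch, not the original
program: the program name FDHAZARD, the DD name TRANIN, and the
end-of-file switch are hypothetical, RECORDING MODE IS V is coded
explicitly (the original relied on the two differing 01 lengths), and
the 8192-byte length is the one quoted for the scope of the INSPECT.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FDHAZARD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    TRANIN is a hypothetical DD name.
           SELECT TRANSACTION-FILE ASSIGN TO TRANIN.
       DATA DIVISION.
       FILE SECTION.
      *    Format-V file with two record descriptions of
      *    different lengths sharing the same record area.
       FD  TRANSACTION-FILE
           RECORDING MODE IS V.
       01  TRANSACTION-HEADER      PIC X(121).
       01  TRANSACTION-FULL        PIC X(8192).
       WORKING-STORAGE SECTION.
       01  WS-EOF-FLAG             PIC X VALUE 'N'.
           88  END-OF-FILE         VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT TRANSACTION-FILE
           PERFORM UNTIL END-OF-FILE
               READ TRANSACTION-FILE
                   AT END
                       SET END-OF-FILE TO TRUE
                   NOT AT END
      *                HAZARD: scans and rewrites all 8192 bytes
      *                of the record area, not just the bytes
      *                that were actually read.
                       INSPECT TRANSACTION-FULL
                           REPLACING ALL ';' BY ','
               END-READ
           END-PERFORM
           CLOSE TRANSACTION-FILE
           GOBACK.
```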

The variable length records were typically a few thousand characters
long, not the full 8192 possible. Program execution showed
unpredictable results when this INSPECT was inserted into the source
and compiled. It would abend seemingly unpredictably, but reproducibly
in the same area of the input file, and a display trace of records read
showed discrepancies with what an ISPF view of the dataset showed. S0C7
sometimes occurred due to a record being read "in the middle" instead
of at the beginning-of-record; more often a QSAM routine would issue an
S002 Reason Code 4, symptomatic of a Record-Descriptor-Word (RDW)
corruption problem in the input dataset.

Amazingly :-), just where the program would abend in the input file
showed a dependency upon the replacement character in the INSPECT,
according to the replacement character's position in the EBCDIC
collating sequence! These abends were reproducible from execution to
execution.

As an experiment, I tried doing a READ INTO into a Working-Storage
area, doing the INSPECT in Working-Storage, and then MOVEing back to
the FD logical record. This also led to the Abends.

However, doing a READ INTO Working-Storage, performing the INSPECT, and
then modifying the program to refer to the Working-Storage record
rather than the FD fixed the problem.
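
The arrangement that worked can be sketched along these lines. Again
this is only an illustration under assumptions: the names CLEANTRN,
TRANIN, WS-TRANSACTION and PROCESS-TRANSACTION are hypothetical, and
the DISPLAY merely stands in for whatever the real program does with
the record. The point is that after the READ INTO, nothing touches the
FD record area; both the INSPECT and all later references use the
Working-Storage copy.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLEANTRN.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    TRANIN is a hypothetical DD name.
           SELECT TRANSACTION-FILE ASSIGN TO TRANIN.
       DATA DIVISION.
       FILE SECTION.
       FD  TRANSACTION-FILE
           RECORDING MODE IS V.
       01  TRANSACTION-HEADER      PIC X(121).
       01  TRANSACTION-FULL        PIC X(8192).
       WORKING-STORAGE SECTION.
      *    Private copy of the record, owned by the program.
       01  WS-TRANSACTION          PIC X(8192).
       01  WS-EOF-FLAG             PIC X VALUE 'N'.
           88  END-OF-FILE         VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT TRANSACTION-FILE
           PERFORM UNTIL END-OF-FILE
               READ TRANSACTION-FILE INTO WS-TRANSACTION
                   AT END
                       SET END-OF-FILE TO TRUE
                   NOT AT END
      *                Clean up the copy, never the FD area.
                       INSPECT WS-TRANSACTION
                           REPLACING ALL ';' BY ','
                       PERFORM PROCESS-TRANSACTION
               END-READ
           END-PERFORM
           CLOSE TRANSACTION-FILE
           GOBACK.
       PROCESS-TRANSACTION.
      *    Placeholder: refer only to WS-TRANSACTION from here on.
           DISPLAY WS-TRANSACTION(1:40).
```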

A couple of paragraphs from the COBOL Programming Guide give us some
hints:

"When you specify a READ INTO statement for a format-V file, the record
size read for that file is used in the MOVE statement generated by the
compiler. Consequently, you might not get the result you expect if the
record just read does not correspond to the level-01 record
description. All other rules of the MOVE statement apply. For example,
when you specify a MOVE statement for a format-V record read in by the
READ statement, the size of the record moved corresponds to its
level-01 record description.

"When you specify a READ statement for a format-V file followed by a
MOVE of the level-01 record, the actual record length is not used. The
program will attempt to move the number of bytes described by the
level-01 record description. __If this number exceeds the actual record
length and extends outside the area addressable by the program, results
are unpredictable__ (emphasis added by me)."

Hmmmm... if I understand this last statement correctly, it is stating
that character positions in the FD 01-level logical record that are
beyond the "actual length" most-recently read may reside outside the
area addressable by the program! My educated guess at this point is
that the "offending INSPECT" was trampling on some OS-dedicated area
containing pointers and such things needed by QSAM to keep track of
things.

I think what this speaks to is the risk associated with input
operations upon variable-length records without taking proper
safeguards with the actual record length, as tracked by, for example,
an OCCURS DEPENDING ON variable. And in general, it is hazardous to
modify the 01-record in the FD of a variable-length input file.
Probably best not to do it at all. Always do it in Working-Storage
instead. I think I remember learning this once about twenty-five years
ago, and in my haste to find a quick fix, forgot it since then. :-)
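
If the record really must be cleaned up in place, one possible
safeguard is to have the compiler hand back the actual record length
and to restrict the INSPECT to that many bytes with reference
modification. The following is an untested sketch of that idea, not a
confirmed fix for the abends described above. The names LENSAFE,
TRANIN and WS-RECORD-LENGTH are hypothetical, and the FROM 121 lower
bound assumes no record is shorter than the header layout.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LENSAFE.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    TRANIN is a hypothetical DD name.
           SELECT TRANSACTION-FILE ASSIGN TO TRANIN.
       DATA DIVISION.
       FILE SECTION.
      *    After each successful READ, WS-RECORD-LENGTH holds
      *    the number of bytes in the record just read.
       FD  TRANSACTION-FILE
           RECORD IS VARYING IN SIZE FROM 121 TO 8192 CHARACTERS
               DEPENDING ON WS-RECORD-LENGTH.
       01  TRANSACTION-HEADER      PIC X(121).
       01  TRANSACTION-FULL        PIC X(8192).
       WORKING-STORAGE SECTION.
       01  WS-RECORD-LENGTH        PIC 9(5) BINARY.
       01  WS-EOF-FLAG             PIC X VALUE 'N'.
           88  END-OF-FILE         VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT TRANSACTION-FILE
           PERFORM UNTIL END-OF-FILE
               READ TRANSACTION-FILE
                   AT END
                       SET END-OF-FILE TO TRUE
                   NOT AT END
      *                Touch only the bytes actually read.
                       INSPECT TRANSACTION-FULL(1:WS-RECORD-LENGTH)
                           REPLACING ALL ';' BY ','
               END-READ
           END-PERFORM
           CLOSE TRANSACTION-FILE
           GOBACK.
```

Even so, the READ INTO Working-Storage approach remains the more
conservative choice, since it never writes to the FD record area at
all.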

Comments, anyone?

Ken