Re: Fortran 77 parser
- From: "James Giles" <jamesgiles@xxxxxxxxxxxxxxxx>
- Date: Fri, 04 Apr 2008 23:57:56 GMT
*** Hendrickson wrote:
James Giles wrote:
Richard Maine wrote:
James Giles <jamesgiles@xxxxxxxxxxxxxxxx> wrote:
So, defining a simple concept: "Open context" means a symbol
that's not within parentheses and not in a string literal,
hollerith constant, or comment.
The Hollerith part is a pain because you can't readily tell whether
something is a Hollerith without parsing the statement, which is a
bit of a circularity issue. Plus the possibility that a quote or
parens might be part of a Hollerith makes the rest of it harder in
the presence of Hollerith also.
This is overstatement. To be sure, you do need to know a small
subset of the syntax rules of Fortran to find hollerith. But it's a
very small subset.
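To make the "open context" idea concrete, here is a rough sketch in Python (not Fortran, and not the published algorithm). The function name open_context_symbols is made up for illustration. It assumes comment lines and continuation handling are already gone, that the statement label has been stripped, and it uses a deliberately simplified Hollerith rule: a digit string followed by H counts as a Hollerith count only when the digits do not continue an identifier. That rule is known to be too naive for length specifiers such as *4.

```python
def open_context_symbols(stmt):
    """Return (index, char) for operator symbols in open context.

    Open context: not inside parentheses, a string literal, or a
    Hollerith constant. Parentheses themselves are not reported.
    """
    found = []
    depth = 0
    prev = ""          # last significant (non-blank) character seen
    i, n = 0, len(stmt)
    while i < n:
        c = stmt[i]
        if c == " ":
            i += 1
            continue
        if c == "'":
            # Skip a string literal; '' inside it is an escaped apostrophe.
            i += 1
            while i < n:
                if stmt[i] == "'":
                    if i + 1 < n and stmt[i + 1] == "'":
                        i += 2
                        continue
                    break
                i += 1
            i += 1
            prev = "'"
            continue
        if c.isdigit() and not prev.isalnum():
            # Candidate Hollerith count (blanks are insignificant here).
            j = i
            while j < n and (stmt[j].isdigit() or stmt[j] == " "):
                j += 1
            digits = stmt[i:j].replace(" ", "")
            if j < n and stmt[j] in "Hh":
                i = j + 1 + int(digits)   # jump over the Hollerith text
                prev = "H"
                continue
            i = j
            prev = digits[-1]
            continue
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif not c.isalnum() and depth == 0:
            found.append((i, c))
        prev = c
        i += 1
    return found


print(open_context_symbols("X = F(A, 3H),() + 2"))
print(open_context_symbols("A = 'IT''S (' // B"))
```

In the first statement the parenthesis and comma inside the 3H constant are skipped, so only the = and the + are reported. In the second, the parenthesis inside the string literal is ignored.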
Yes, but. FORTRAN 77 code was rife with extensions. I generally
agree with you that Sale's algorithm is a good thing. But, it needs to
be aware of things like
REAL*4 HENRY
when it strips out Hollerith. For good or bad, it's a provable
fact that "working" parsers tripped over that one. :(
Oh yes (I said I was working from memory). The original paper
on the F77 Sale's algorithm did have special cases for seeing open
context asterisks before any equal was seen. Also for slashes (some
issue related to a specific property of DATA statements as I recall).
The issue doesn't only arise with non-standard forms. The following
is conforming:
CHARACTER*4 HENRY
And yes, unlike my faulty memory, the published algorithm accounted
for it.
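A small Python sketch of why CHARACTER*4 HENRY trips a naive Hollerith stripper, and of the kind of special case described above. The guard used here (an open-context asterisk before the digits, no equals sign seen yet, and not between the slashes of a DATA value list) is my reconstruction for illustration, not the rule set from the published paper. The name find_hollerith and its guard_length_spec flag are hypothetical, and string literals are not handled.

```python
def find_hollerith(stmt, guard_length_spec=True):
    """Return the text of each Hollerith constant found in stmt."""
    found = []
    depth = 0
    in_slashes = False     # inside /.../ at paren depth 0 (DATA value list)
    seen_equals = False    # an open-context '=' has been seen
    prev = ""              # last significant (non-blank) character
    i, n = 0, len(stmt)
    while i < n:
        c = stmt[i]
        if c == " ":
            i += 1
            continue
        if c.isdigit() and not prev.isalnum():
            j = i
            while j < n and stmt[j].isdigit():
                j += 1
            k = j
            while k < n and stmt[k] == " ":
                k += 1
            # Assumed special case: '*n' in open context before any '='
            # is a length specifier, not a repeat count or Hollerith count.
            length_spec = (guard_length_spec and prev == "*" and depth == 0
                           and not in_slashes and not seen_equals)
            if k < n and stmt[k] == "H" and not length_spec:
                count = int(stmt[i:j])
                found.append(stmt[k + 1:k + 1 + count])
                i = k + 1 + count
                prev = "H"
                continue
            prev = stmt[j - 1]
            i = j
            continue
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif depth == 0:
            if c == "/":
                in_slashes = not in_slashes
            elif c == "=":
                seen_equals = True
        prev = c
        i += 1
    return found


print(find_hollerith("CHARACTER*4 HENRY", guard_length_spec=False))
print(find_hollerith("CHARACTER*4 HENRY"))
print(find_hollerith("DATA X/3*4HABCD/"))
```

Without the guard, the scanner takes 4 HENRY as the constant 4HENRY and swallows ENRY. With it, the declaration is left alone while the repeat count in the DATA statement still yields ABCD.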
Similarly, things like
DOUBLE PRECISION NAME
were potentially ambiguous with compilers that allowed DOUBLE
as a keyword and also allowed names longer than 6 characters.
Sale's algorithm did use the kind of rule everyone else does: that
the longest part that matches some known keyword would all be
regarded as that keyword. So the above would have been recognized
as the DOUBLEPRECISION keyword followed by identifier NAME. If
you wanted a variable called PRECISION, you would have to have
written:
DOUBLE PRECISION PRECISION, NAME
or
DOUBLE NAME, PRECISION
In my experience people that use extensions (like abbreviated
keywords) usually accept the need to work around the seemingly
ambiguous cases they create.
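The longest-match rule is easy to sketch in Python. The keyword table below is a made-up subset, with bare DOUBLE included only as the hypothetical extension under discussion. The name split_type_statement is mine, and splitting the entity list on commas is a simplification that ignores array declarators.

```python
# Hypothetical keyword table; bare DOUBLE is the extension being discussed.
KEYWORDS = ["DOUBLEPRECISION", "DOUBLE", "CHARACTER", "INTEGER",
            "LOGICAL", "COMPLEX", "REAL"]


def split_type_statement(stmt):
    """Split a type statement into (keyword, names) by longest match."""
    # Blanks are insignificant in fixed-form source, so squeeze them out.
    squeezed = stmt.replace(" ", "").upper()
    # Try longer keywords first so DOUBLEPRECISION beats DOUBLE.
    for kw in sorted(KEYWORDS, key=len, reverse=True):
        if squeezed.startswith(kw):
            return kw, squeezed[len(kw):].split(",")
    return None, []


print(split_type_statement("DOUBLE PRECISION NAME"))
print(split_type_statement("DOUBLE PRECISION PRECISION, NAME"))
print(split_type_statement("DOUBLE NAME, PRECISION"))
```

The first statement comes out as DOUBLEPRECISION with the single name NAME, and the other two show both workarounds for declaring a variable called PRECISION.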
It's OK to say that the parser should only accept standard conforming
code. But, if you're trying to develop a commercially useful
parser, you need to worry about the non-standard stuff.
Some of the non-standard stuff anyway. I've never even tried to
parse Fortran without accommodating hollerith (both within and
outside of formats). I've never even heard of compilers that
support just DOUBLE as a keyword, so I've never accommodated it.
As I describe above, I *could* do so. I just never thought of it.
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare