Re: The linf project



analyst41@xxxxxxxxxxx wrote:
2) Lexical structure next? Well, case should have no significance
except in character literals. Identifiers should have no a-priori
limit in length. The bracketing delimiters {}, [], and () should all
have identical meanings. And so on. Settling all these issues would
take an afternoon (at least). Like: should keywords be reserved?
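
A minimal sketch of the two lexical rules above, in Python rather than in any proposed syntax: case is folded everywhere except inside character literals, and all three bracket pairs are mapped to one spelling. The name normalize and the choice of quote characters are assumptions made for illustration, and escape sequences inside literals are not handled.

```python
OPENERS = "{[("
CLOSERS = "}])"


def normalize(source, quotes="'\""):
    """Fold case outside character literals and unify bracket spellings."""
    out = []
    in_literal = None  # the quote character that opened the current literal
    for ch in source:
        if in_literal:
            # Inside a literal nothing is changed, so case stays significant.
            out.append(ch)
            if ch == in_literal:
                in_literal = None
        elif ch in quotes:
            in_literal = ch
            out.append(ch)
        elif ch in OPENERS:
            out.append("(")
        elif ch in CLOSERS:
            out.append(")")
        else:
            out.append(ch.lower())
    return "".join(out)


print(normalize("Print[ A{1} , 'Hello [World]' ]"))
```

A later pass would see only one bracket pair and one spelling of each identifier, which is all "identical meanings" requires at this level.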

[a look and feel issue: periods, ampersands, percent signs etc. in
contexts that give a slashqz feel to code would be extirpated.
single and double quotes should have the same functionality? ]
Well, you didn't answer the question about keywords. I would
oppose (for any language) having reserved keywords for reasons
I gave previously in the thread (or the other thread that this
continues). But the decision has consequences.

Now, what are the consequences of what answers you did
provide? I've never seen the word "slashqz" before. But
I presume you mean the C-like line noise simulation. While
it's true that cryptic notation should be avoided whenever
possible, there are no absolutes.

A language needs clarity, succinctness (without being terse),
and there are only so many characters to choose from. If you
leave out features because *you* think their notation is not
aesthetic, you will also be leaving out users that need that
feature. If you design-out what you consider the inaesthetic
notation, the feature is likely to be verbose and *less*
legible - and, again, you will leave out users that need the
feature frequently. Language design is an exercise in
compromise.

Hmm. No ampersands? How do you do continuation? And
don't say column 6. That's seriously anti-productive no matter
how large or small programs are, no matter how fast or slow
you develop them, and no matter how long you intend them to
last.

Of course, especially if there is no arbitrary line length limit,
you *could* say that no continuation is allowed. I doubt you
would get much agreement.

For a simple language (and, I believe, for *any* language)
continuation should be explicit and there should be only one
syntax for it. (There are productivity experiments that
demonstrate that line-wrapping instead of explicit continuation
is bad for productivity. And yes, even for short, quickly
written programs.)

And, continuation shouldn't be allowed to break any tokens.

I think the best solution is the only one that Fortran
explicitly prohibits: the continuation character should
be the first non-blank character of the continuing line,
not the last non-comment character of the continued line.
And the continuation character should be required on every
line (after the initial line), even otherwise blank lines,
or lines with only comments. Oh, and there should be no
arbitrary limit on how many lines a statement can span.
(As another aside, if two lines both begin with a continuation
character, they must both be part of the same continued
statement. Ending one statement and beginning another
incomplete one on the same line is a nasty practice
that the language can easily just prohibit.)
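
The continuation rule described above can be sketched as a small line joiner. This is only an illustration in Python, assuming the ampersand as the marker; the name join_continuations is hypothetical, and comment syntax is left out because none has been settled.

```python
def join_continuations(lines, marker="&"):
    """Join physical lines into logical statements.

    A line whose first non-blank character is `marker` continues the
    statement on the preceding line; any other non-blank line starts a
    new statement. A blank line without the marker ends the statement.
    """
    statements = []
    can_continue = False  # True while the previous line belongs to a statement
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if text.startswith(marker):
            if not can_continue:
                raise SyntaxError(f"line {number}: nothing to continue")
            rest = text[len(marker):].strip()
            if rest:
                # Join with one space so a token can never straddle two lines.
                statements[-1] += " " + rest
        elif text:
            statements.append(text)
            can_continue = True
        else:
            can_continue = False
    return statements


source = ["x = a + b", "   & + c", "&", "& + d", "y = 1"]
print(join_continuations(source))
```

Because the marker leads each continuing line, whether a line belongs to the previous statement is decided by looking at that line alone, and there is no limit on how many lines a statement spans.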

Now, what should that continuation character be? I like the
ampersand. It's a ligature for the Latin word et. Using it
to denote continuation is closer to the usual meaning of et
(in Latin sayings I've been exposed to anyway - I don't pretend
I know Latin) than, say, using it to mean logical AND.
I mentioned before that your character set is limited. If
you are really serious about developing a language, I'm
sure you'll find yourself staring at the summary card for
your character set and wishing there were more (and that
they more closely fit the meanings you wish to ascribe to
them). Continuation is probably the least cryptic use of
ampersand you are likely to find. You can even put it in
column 6 if you like.

Hmm. No cryptic periods? OK that leaves .TRUE. and
.FALSE. out of the language. You don't think that's cryptic?
You've been indoctrinated. In most languages that have
enumerated types (yes, that's what LOGICAL is) the literals
are just ordinary identifiers. I think that enumeration literals
actually *should* be distinct from identifiers. I may be the
odd man out. Whether delimiting with periods is the best
way to make them distinctive is hard to judge (I'm indoctrinated
too).

And, you can't have .AND., .LE., etc. Well, I think the usual
symbolic operators are preferable anyway. I haven't seen any
actual experiments proving it, but it seems right. For AND, OR,
NOT, and NEQV (XOR), I prefer /\, \/, ¬ (~), and ><. And since
you prefer leaving out user defined operators (with the loss of
users that like them), I guess you can do without periods as
delimiters.
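
As a rough illustration, the symbolic spellings above can be treated as a small lookup from token to Boolean function. The table and function names are hypothetical, and Python is used only as a convenient notation.

```python
import operator

# Hypothetical token table for the binary logical operators.
LOGICAL_OPS = {
    "/\\": operator.and_,  # AND
    "\\/": operator.or_,   # OR
    "><": operator.xor,    # NEQV (XOR)
}


def apply_logical(symbol, left, right):
    """Apply the binary logical operator spelled `symbol` to two Booleans."""
    return LOGICAL_OPS[symbol](left, right)


def logical_not(value):
    """The unary operator spelled ~ (or the not-sign where available)."""
    return not value


for symbol in LOGICAL_OPS:
    print(symbol, apply_logical(symbol, True, False))
```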

Hmm. No cryptic percent? OK without delimited enumeration
literals or operators you can use period. Oh, but you said
you didn't want *any* derived types. OK, so you do without
a quite large segment of your potential users.


3) Intrinsic data types? Floats (at least all the types supported by
the host hardware), integer (one type, arbitrary precision), rationals
(arbitrary precision), character (at least ASCII, Latin-1, and some
UNICODE - I'd personally prefer UTF-16), and Logical (one type).
No KINDs, all these are distinct types. (No parameterized types
at all: you did posit it had to be a simple language.) Other intrinsics
might be desirable (bits say). Depends on details elsewhere.

[everything except integer double (do you really need single float
anymore) and basic character types should be realized through
libraries.]

Well, I didn't mention "integer double". Perhaps you meant
"integer, double, and basic character types"? In any case, how
does someone realize a new data type through libraries? I've
never seen that done. If you mean the clumsy way we used to
simulate derived types with integer arrays and type-cheats,
that's clearly *not* a recipe for improving productivity, whether
the program is rapidly prototyped or written for the long haul.

5) External procedures, module procedures, or internal procedures?
A simple language should have only one of these features. Modules
are a cleaner solution than externals and provide a way to share
some things without making them public. As for compilation cascades:
well such a language is involved with small programs anyway (and
may even be implemented interpretively), so it's not really an issue.
And again, you will probably want to save and reuse *some* codes:
modules provide good packaging.

[external procs?

modules only for data?]

[The language will not attempt to provide safety by itself. All
facilities for safe coding will be provided and it would be the
responsibility of the programmer to discern the safe features and only
use them.]

I agree that features that decrease the expressiveness of a language
merely for the elusive goal of safety are to be avoided. However,
putting procedures into modules *doesn't* decrease expressiveness
(in fact, by allowing the sharing of PRIVATE things, your
expressiveness is increased). Adopting less expressive features
simply to make the language less safe seems lose/lose to me.

6) Derived data types, yes or no? I think yes. If you need the
ability, there's no useful alternative. And if you don't need it, you
won't notice it's there. It's not like it's something you can
accidentally trip over. If you have this, then COMPLEX (which I didn't
mention as an intrinsic) should be a predefined one of these. This
obviously involves the introduction of overloaded operators and
intrinsic functions.
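
A rough sketch of what COMPLEX as a predefined derived type implies, written in Python as a stand-in: an ordinary two-field record plus overloaded operators and one overloaded intrinsic (abs). The class name Complex and the field names re and im are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    re: float
    im: float

    def __add__(self, other):
        # Overloaded +: add component by component.
        return Complex(self.re + other.re, self.im + other.im)

    def __mul__(self, other):
        # Overloaded *: (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        return Complex(
            self.re * other.re - self.im * other.im,
            self.re * other.im + self.im * other.re,
        )

    def __abs__(self):
        # Overloaded intrinsic: abs(z) is the magnitude of z.
        return math.hypot(self.re, self.im)


z = Complex(1.0, 2.0) * Complex(3.0, 4.0)
print(z, abs(Complex(3.0, 4.0)))
```

A user who never declares a Complex never meets any of this, which is the sense in which the feature is not something you can accidentally trip over.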

[does this include structures? structures immediately introduce
slashqzs and I would stay away.]

Ok, but I should warn you that you are probably not only in a
small minority, but very close to unique in your opposition
to user defined data types.


7)
[...]INTENT? Yes (IN or OUT - not
specifying should mean the same thing as INOUT, so you don't
actually need to be able to say INOUT). OPTIONAL? Maybe,
but with different rules than Fortran. Asynchronous, protected,
volatile, etc.? No. This is supposed to be a simple language.
[...]
[prissy spinster-aunt features such as IN INOUT etc. don't belong.
The language will allow for correct legible fast code to be written
but will not sacrifice its internal beauty to protect programmers from
themselves.]

Well, you are pretty much alone on this one too. Nice propaganda
though.

I don't have all day to address what's likely to die on the vine anyway.
But:

(The above is not in any particular order of priority. It's just the
order the issues came to mind, and is not recommended as any
outline for further discussion. Indeed, since it's just off the cuff,
there may be issues not on the list that, on reflection, I would
regard as more important than many of the things I mentioned.)

I think you need to take this to heart. An informally ordered
unsystematic presentation simply confuses. I suspect most
aren't reading this. You need to specify clearly, and in some
systematic order what your language's syntax, semantics, and
pragmatics are expected to be. And, since it's such a big subject,
you should probably stick to a single issue or feature per article,
not the whole language. I made an extensive list only to point
out how big the subject really was.

--
J. Giles

"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare

