Re: Seed7 (was: Program compression)
- From: thomas.mertes@xxxxxx
- Date: Fri, 11 Jul 2008 03:49:28 -0700 (PDT)
On 6 Jul., 09:36, jaycx2.3.calrob...@xxxxxxxxxxxxxxxxxxxxxx (Robert
Maas, http://tinyurl.com/uh3t) wrote:
Date: Thu, 26 Jun 2008 14:21:31 -0700 (PDT)
From: thomas.mer...@xxxxxx

Why this response is so belated:

Anyway, thank you for the feedback.

Are there standard packages available to provide these as
"givens" with a well-documented API so that different application
programmers can read each other's code?

It is not the intention of Seed7 that everybody re-invents the
wheel. There is a well-documented API. The predefined statements of
Seed7 are described here:

Given that such statements aren't in the *core* of the language,
but are added later when the library containing their definitions
is loaded ...

Actually the statements are added during the parsing process.

(possibly when building the executable that is saved on
the disk to avoid the cost of loading the library again each time
the executable is run):
- Does Seed7 include a parser that reads Seed7 source-code syntax
(from an input stream such as from a text file, or from the
contents of a string) and produces a parse tree (as a pointy

Actually there are the functions 'parseFile' and 'parseStri' which
can be used to parse Seed7 source programs into values of the
type 'program'. I just added a short description about the type
'program' to the manual. See:
Note that I try to handle the program currently executed and the
program which was parsed into a 'program' variable to be separate.
It is possible to execute 'program' values and to request the code
of a 'program' value in a structured form. Actually the Seed7 to C
compiler uses this feature to generate the C code. Currently the
features of this reflection are designed to make them usable for the
It is not my intent to support programs which manipulate their own
code, as is done with "self modifying code".
- If so, does this parser automatically get enhanced whenever a new
statement type is defined in some library that was loaded, so
that statements defined in the library can now be parsed?

Yes. This is something happening during the parsing process.
Loading a library at runtime as a way to introduce new statements
for the program which is currently running IMHO makes no sense.
- If so, is there also an inverse function that prettyprints from a
parse tree back out to textual source-code syntax?

During the parsing some information, such as whitespace and comments,
is lost. Some information about the position of an expression
is maintained. Generally I think that such a prettyprinter would be

- If so, does that prettyprinter function also get automatically
enhanced whenever a new statement type is defined, so that
statements of that new type can be printed out meaningfully?

I have done nothing in this direction, but I think that it should
be possible. It might be necessary to extend the reflection if
some functionality necessary for prettyprinting is missing.
I ask because in Lisp it's almost trivial to write a utility that
reads in source code, analyzes it for some properties such as
undeclared free variables or functions etc., in order to build a
master cross-reference listing for an entire project and to flag
globally undefined functions, and also it's trivial to write code
that writes code and then either executes it with the same
core-image or prettyprints it to a file to be compiled and/or
loaded later. So I wonder if Seed7 provides the primitives needed
to make the same kinds of tasks equally easy for Seed7 sourcecode.

Such things are planned for Seed7. I have started with a program
which generates html documentation including a source file where
every use of a function is linked to its definition. The program
(doc7) works to some degree, but not well enough to release it.
Other types and their functions (methods) are described here:
| boolean conv A Conversion to boolean
| ( Type of argument A: integer,
| boolean conv 0 => FALSE,
| boolean conv 1 => TRUE )
Is the behaviour defined for other values given? Does it throw an
exception, or are compiler writers free to treat other integers any
way they feel, resulting in code that produces different results
under different implementations? The Common Lisp spec is careful to
have undefined behaviour only in cases where the cost of
prescribing the behaviour would have a good chance of greatly
increasing the cost of implementation. Is such the case here?

This function was added recently to be helpful for the I4 P-Code
interpreter which would be used for the P4 Pascal compiler.
Yes, I transferred this classic Pascal compiler to Seed7...
The I4 interpreter needs cheap functions to transfer all its
basic types like boolean, float, char, set to and from integer.
The Pascal version of the I4 interpreter uses an unchecked
cased record (which is equivalent to a C union). Therefore I
introduced this function. It was a mistake to add this function
to the documentation, since it is currently only experimental.
BTW it works like the odd function.
A short explanation of Seed7 container types is here:

Again, these are such basic container types that they really ought
to be provided in a standard package. Are they?

| ord(A) Ordinal number
| ( Type of result: integer,
| ord(FALSE) => 0, ord(TRUE) => 1 )
So this is just the inverse of boolean conv?

Yes, but the introduction of 'boolean conv' was just experimental.
The function 'odd(integer)' is the function to be preferred
to convert an integer into a boolean.
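
As an illustration only, here is a small Python sketch (not Seed7 code) of the relationship between 'odd' and 'ord' described above. The helper names `odd` and `ord_bool` are made up for this example.

```python
def odd(number: int) -> bool:
    """Return True for odd integers; defined for every integer value."""
    return number % 2 == 1


def ord_bool(value: bool) -> int:
    """Ordinal number of a boolean: FALSE => 0, TRUE => 1."""
    return 1 if value else 0


# On the values 0 and 1 the two functions are inverses of each other.
for n in (0, 1):
    assert ord_bool(odd(n)) == n

# Unlike the experimental 'boolean conv', 'odd' also has a defined
# result for all other integers.
print(odd(7), odd(-3), odd(10))
```

The point of preferring 'odd' is that no integer argument is left without a defined result.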
| succ(A) Successor
| ( succ(FALSE) => TRUE,
| succ(TRUE) => EXCEPTION RANGE_ERROR )
| pred(A) Predecessor
| ( pred(FALSE) => EXCEPTION RANGE_ERROR,
| pred(TRUE) => FALSE )
Why even bother, unless this is a hackish way of conditionally
signalling an exception?

Such functions are present to be usable in generic code. That way
the generic code can assume that the 'succ' function is present. The
'incr(A)' function, which is just a shortcut for 'A := succ(A)', is
also present just for this purpose. An example of a template
function which uses 'incr' is here:
| rand(A, B) Random value in the range [A, B]
| ( rand(A, B) returns a random value such that
| A <= rand(A, B) and rand(A, B) <= B holds.
| rand(A, A) => A,
| rand(TRUE, FALSE) => EXCEPTION RANGE_ERROR )
What distribution within that range, uniform or what?

Uniform. I added an explanation to the documentation.

What sorts of datatypes are allowed for A and B?
Do they have to be of the same datatype, or can they be unrelated?
rand(3,9.5) rand(4.7SinglePrecision, 9.7DoublePrecision)
rand("FALSE",4.3) rand(FALSE,"TRUE") rand(TRUE,3)
Which if any of those expressions are conforming to the spec?

There is a general rule to keep the descriptions short. This rule
can be found at the beginning of the chapter "PREDEFINED TYPES":
The operators have, when not stated otherwise, the type described
in the subchapter as parameter type and result type.
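
To make the quoted 'rand' contract concrete, here is a minimal Python sketch (not Seed7 code) for the integer case. It assumes both bounds have the same type and uses Python's `random.randint`, which is uniform over the closed range. The Python `ValueError` merely stands in for Seed7's RANGE_ERROR.

```python
import random


def rand(low: int, high: int) -> int:
    """Uniform random integer r with low <= r <= high."""
    if low > high:
        # Stand-in for EXCEPTION RANGE_ERROR in the quoted description.
        raise ValueError("empty range")
    return random.randint(low, high)


# rand(A, A) => A
assert rand(3, 3) == 3
# Every result lies inside the closed range [A, B].
assert all(1 <= rand(1, 6) <= 6 for _ in range(1000))
```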
Note that the 'and' and 'or' operators do not work correct when
side effects appear in the right operand.
What is that supposed to mean??? If you use an OR expression to
execute the right side only if the left side is false, what
happens? I would expect what I said to happen, but the spec says
that doesn't work correctly? So what really happens?
(or (integerp x) (error "X isn't an integer")) ;Lisp equivalent

This is a bug in the spec. I have corrected the sentence to:
Note that this early termination behaviour of the 'and' and 'or'
operators also has an influence when the right operand has side
effects.
AFAIK many languages such as C, C++ and Java have this behaviour.
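
The early termination behaviour can be shown with a short Python sketch (Python's 'and' and 'or' also terminate early; this is not Seed7 code). The function `right_operand` and the list `calls` are made up for the illustration.

```python
calls = []


def right_operand() -> bool:
    """Right operand with a side effect: it records that it was evaluated."""
    calls.append("evaluated")
    return True


# The left operand already decides the result, so the right side never runs.
result_or = True or right_operand()
calls_after_or = len(calls)

# The same holds for 'and' when the left operand is False.
result_and = False and right_operand()
calls_after_and = len(calls)

# Only when the left operand does not decide the result is the
# right operand evaluated, and its side effect becomes visible.
result_or_2 = False or right_operand()
calls_at_end = len(calls)
```

After this runs, the side effect has happened exactly once, in the last expression.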
The result of an 'integer' operation is undefined when it overflows.
I would like to raise exceptions in such a case, but as long
as there is no portable support for that in C, Posix or some
other common standard, it would be hard to support it with
Does Seed7 provide any way to perform
unlimited-size integer arithmetic, such as would be useful to
perform cryptographic algorithms based on products of large prime
numbers, where the larger the primes are the more secure the
cryptographic system is?

Yes.

| ! Faktorial
Is that how the word is spelled somewhere in the English-speaking world?

This is what a word looks like when it is not translated correctly.
Thank you for pointing this out.
| div Integer division truncated towards zero
| ( A div B => trunc(flt(A) / flt(B)),
| A div 0 => EXCEPTION NUMERIC_ERROR )
| rem Reminder of integer division div
| ( A rem B => A - (A div B) * B,
| A rem 0 => EXCEPTION NUMERIC_ERROR )
Most CPUs, or software long-division procedures, compute quotient
and remainder simultaneously.
Is there any way for a Seed7 program to get them together?

Currently not, but it is not hard to add such a thing.

It's wasteful to throw away the remainder then need to multiply the
quotient by the divisor and subtract to generate a copy of the
remainder that was thrown away a moment earlier.

I guess that a good optimizing compiler can recognize the situation
when 'a div b' and 'a rem b' are computed close together without
changing a or b in between. Since Seed7 is compiled to C, I think
that I can rely on the C compiler to do this optimisation.
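
For illustration, a combined operation with the quoted 'div' and 'rem' semantics could look like the following Python sketch. This is not an existing Seed7 function, and the name `div_rem` is hypothetical. Python's built-in `divmod` rounds towards negative infinity, so the sketch divides the absolute values once and then fixes the signs.

```python
def div_rem(a: int, b: int) -> tuple:
    """Return (a div b, a rem b) with the quotient truncated towards zero."""
    if b == 0:
        # Stand-in for EXCEPTION NUMERIC_ERROR in the quoted description.
        raise ZeroDivisionError("division by zero")
    # A single division delivers quotient and remainder together.
    quotient, remainder = divmod(abs(a), abs(b))
    if (a < 0) != (b < 0):
        quotient = -quotient      # the quotient is negative when signs differ
    if a < 0:
        remainder = -remainder    # the remainder takes the sign of the dividend
    return quotient, remainder


# The identity from the spec holds: A rem B = A - (A div B) * B
for a, b in ((7, 2), (-7, 2), (7, -2), (-7, -2)):
    q, r = div_rem(a, b)
    assert r == a - q * b
```

Because Python integers are unlimited in size, the same sketch also covers the big integer case.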
(multiple-value-setq (q r) (floor a b)) ;Can do it in Lisp

| ** Power
| ( A ** B is okay for B >= 0,
| A ** 0 => 1,
| 1 ** B => 1,
| A ** -1 => EXCEPTION NUMERIC_ERROR )
So -1 ** -1 is required by the spec to signal an exception, instead
of giving the mathematically correct result of -1?

In the general case 'a ** -1' does not have an integer result.
AFAIK Ada also does it that way.
BTW: The type float also has exponentiation operators defined.

While 0 ** 0 which is mathematically undefined is *required* to return 1?

This behaviour is borrowed from FORTRAN, Ada and some other
programming languages which support exponentiation.
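
The quoted rules for integer exponentiation can be written down as a small Python sketch (not Seed7 code). The name `int_power` is hypothetical and Python's `ArithmeticError` stands in for NUMERIC_ERROR.

```python
def int_power(base: int, exponent: int) -> int:
    """Integer power: defined for exponent >= 0, an error otherwise."""
    if exponent < 0:
        # In general base ** -1 has no integer result, so every
        # negative exponent is rejected, including (-1) ** -1.
        raise ArithmeticError("negative exponent")
    return base ** exponent


assert int_power(5, 0) == 1   # A ** 0 => 1
assert int_power(1, 9) == 1   # 1 ** B => 1
assert int_power(0, 0) == 1   # the convention borrowed from FORTRAN and Ada
```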
I'm too tired to proofread your spec any further.

| flip(A) Deliver a hash with keys and values flipped
| ( Type of result: hash [baseType] array keyType )
How is that possible if the table isn't a 1-1 mapping??

That's the reason the result is of type:
hash [baseType] array keyType.
The values in the hash tables are arrays with keyType elements.
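
The idea behind this result type can be sketched in Python (not Seed7 code), with a dict in the role of the hash and a list in the role of the array. The function below is only an illustration of the principle, not the Seed7 implementation.

```python
def flip(table: dict) -> dict:
    """Flip keys and values; each old value maps to a list of its old keys."""
    flipped = {}
    for key, value in table.items():
        # Several keys may share one value, so the keys are collected in a list.
        flipped.setdefault(value, []).append(key)
    return flipped


# "a" and "b" map to the same value, so the table is not a 1-1 mapping.
ages = {"a": 1, "b": 1, "c": 2}
by_age = flip(ages)
assert by_age[1] == ["a", "b"]
assert by_age[2] == ["c"]
```

No information is lost when the table is not a 1-1 mapping, because colliding keys end up in the same list.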
What precisely do you mean by "templates"?

What computer scientists mean when they speak about "templates"
is explained here:

I.e. exactly what C++ implements, everything else is different and
not the same thing and substandard compared to C++ templates,

Wrong. I use the word template to describe a function which is
executed at compile time and declares some things while executing
(at compile time). For example: The function 'FOR_DECLS' is used to
declare for loops. FOR_DECLS gets a type as parameter and declares
a for loop for that type. This is explained here:
As you can see it is necessary to call template functions explicitly.
They are not invoked implicitly like the C++ template functions.
IMHO these explicit calls of template functions make the program
easier to read. Maybe I should add something to the FAQ.
] Although functions can return arbitrary complex values (e.g. arrays of
] structures with string elements) the memory allocated for all
] intermediate results is freed automatically without the help of a
] garbage collector.

I have improved the FAQ for this. See:

| Memory used by local variables and parameters is automatically freed
| when leaving a function.
That doesn't make sense to me. Suppose there's a function that has
a local variable pointing ...

Pointers are something else. If you are using pointers you are
responsible to manage that they point to reasonable data.
The automatic freeing of local variables has exceptions (sorry
I will add an explanation to the FAQ). The values referred by
pointers and the values referred by interface types are not managed

to an empty collection-object (set, list,
array, hashtable, etc.) allocated elsewhere. Now the function runs
a loop that adds some additional data to that collection-object, so
the object is now larger than before. Now the function returns. How
much of that collection-object is "memory used by local variables
and parameters" hence "automatically freed when leaving a
function", and how much of that collection-object is *not* such and
hence *not* automatically freed upon leaving the function?

I see. We have a cultural misunderstanding.
Let's say the collection is an array.
What you were suggesting is a collection declared with:
array ptr myData
which means that the collection contains pointers to myData
(some structure). In this case you are right and automatic
management is not possible (at least in the sense I talked about).
What I have in mind when talking about automatically managed memory
is a collection declared with:
In this case the collection contains (copies of) the actual data.
When such a collection is freed it can also free its content
since it owns it. And these are the things which are done automatically
in a stack oriented manner.
If you use pointer structures for everything you are right that
a GC or a manually managed heap is necessary. In Seed7 many things
can be done with abstract datatypes.
If abstract datatypes are used in an efficient way, the need to
use pointers in Seed7 is not as high as in some other languages.
Debug-use case: An application is started. A semantic error (file
missing for example) throws user into break package. User fixes the
problem, but saves a pointer to some structure in a global for

You are much too deep in the Lisp way of thinking. If a stack
shrinks the elements popped from the top just don't exist any more.

So it's impossible in Seed7 to have a function create an object and
return it, because anything created by a function doesn't exist any
more after return?

The return variable is excluded from this mechanism.
[snip reasons why not all data should be stack oriented]
It is also a common bug in C/C++ and other similar languages when
a function returns a pointer to some local data (which is at the
Are you saying that in Seed7 it's impossible to have any data that
is not on the stack, hence it's impossible for any function to
allocate memory for some object and then *return* a pointer to that
object so that the caller can later work with that object?
If you want such data to be available later you need to make a
I agree that some data cannot be managed in a stack oriented way.
;Static type checking fails to detect this type-mismatch undefined-method error.

... Every bug found at compile-time will not make you trouble at
run-time. The earlier you can eliminate bugs the better.

Why do you feel the need to have two different times, one where
static stuff is checked but you have no idea what's really
happening, and then one where things actually happen?

I think that compile-time type checking can find bugs which
slip through the fingers when you test your program.
IMHO even a test with 100% code coverage is not sufficient since
the combination of all places where values are generated and
all places where these values are used must be taken into account.
I have again improved the FAQ to contain this argumentation:
I suppose in
Seed7 it's impossible to have an interactive loop where you can
type in a line of code and it *immediately* does something and you
*immediately* see whether it did what you expected it to do instead
of needing to wait until the whole program is compiled before you
can see what that one line of code did?

No. There is the type 'program' which can be used for that.
BTW the Seed7 parser usually processes 200000 lines per second.
... There is also the type 'bigInteger' which serves as
unlimited-size signed integer. The type 'bigInteger' is explained
| div Integer division truncated towards zero
| ( A div B => trunc(A / B),
| A div 0_ => EXCEPTION NUMERIC_ERROR )
| rem Reminder of integer division div
| ( A rem B => A - (A div B) * B,
| A rem 0_ => EXCEPTION NUMERIC_ERROR )
For bigIntegers, it's especially painful not to have division with
both quotient and remainder directly returned. It takes a lot of
extra time to multiply the quotient by the divisor (then subtract
that from the original value) to get back the remainder after
having thrown away the remainder in the first place.

When there is demand, such a function can be added.
What is the precise meaning of type 'char'?

The 'char' values use the UTF-32 encoding, see:
| The type 'char' describes UNICODE characters. The 'char' values use
| the UTF-32 encoding. In the source file a character literal is written
| as UTF-8 UNICODE character enclosed in single quotes. For example:
| 'a' ' ' '\n' '!' '\\' '2' '"' '\"' '\''
That's not written well at all. It doesn't make sense.

Agree. Historically these were just character literal examples.
Some of them use escape sequences, which were explained below.
Now you get the impression that these are UTF-8 literal
examples, which was not the original intent.
UTF-8 coding of a single character, it's two different UTF-8
(US-ASCII subset thereof) characters which are a C convention for
Likewise \\ is two UTF-8 (US-ASCII) characters.
Likewise \" is two UTF-8 (US-ASCII) characters.
Likewise \' is two UTF-8 (US-ASCII) characters.

These are escape sequences. They are explained in the next paragraph.
I have moved the character literal examples after the explanation
of escape sequences. That way you don't get the impression that
this is an explanation of UTF-8 literals.
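
The difference between how a character is written in the source and the character value it denotes can be shown with a short Python sketch (Python string literals use similar escape sequences; this is not Seed7 code).

```python
newline = "\n"     # escape sequence: two source characters, one character value
raw_text = r"\n"   # raw string: backslash and 'n' remain two characters

assert len(newline) == 1
assert len(raw_text) == 2

# The same character value needs a different number of bytes
# depending on the encoding chosen to store it.
e_acute = "\u00e9"
utf8_bytes = e_acute.encode("utf-8")        # variable width encoding
utf32_bytes = e_acute.encode("utf-32-be")   # fixed width, 4 bytes per character

assert len(utf8_bytes) == 2
assert len(utf32_bytes) == 4
```

So the escape sequence and the encoding of the source file are two separate questions: the first decides which character is meant, the second how it is stored.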
I don't think you have any idea what UTF-8 really means.

As I have implemented UTF-8 support for Seed7, I think I know
something about it.
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.