Re: RFCAS - The San Data Flow and Messaging Engine
From: Richard Harter (cri_at_tiac.net)
Date: Thu, 19 Feb 2004 10:58:35 GMT
On 13 Feb 2004 10:24:54 GMT, firstname.lastname@example.org (Bill
>email@example.com (Richard Harter) wrote:
>> Agents are event driven. An agent may (but need not) have a
>> single background thread. In addition to the optional background
>> thread it can have an indefinitely large number of event handlers
>> called on units.
>It sounds like your agents are similar to the boxes of a data flow diagram.
>(I'm thinking of the SSADM DFD, but I am aware that other DFD standards
>exist and they are all pretty much the same, notwithstanding the shape of
>the blobs that you write things in.)
That would be right.
Agents can have data and procs (procedures/functions) defined within
them, but these are "inside the black box" and are not externally
visible.
One suggestion that was made to me in email was that agents can have
subagents (boxes within boxes, wheels within wheels), making for a
more natural decomposition. An advantage of this idea is that
exception handling would be cleaner.
>> Agents communicate with each other in one of two ways, by pipes
>> or by messages.
>I'm unconvinced that you need the two schemes. I'd just allow each agent to
>expose any number of ports to other agents. If you feel so inclined, you
>could then set up access rights later on.
I wasn't thinking in terms of agents being allowed to expose ports,
but rather that all ports are potentially exposed. As far as pipes
(data flow connections) are concerned, agents don't specify where data
comes from or where they send data.
This lack of control by the agents is, I opine, fundamental in the
data flow paradigm. Agents, the boxes in a data flow diagram, are
black boxes. They have inputs and outputs, and defined transfer
functions; there is no other interaction from within the box with the
outside world. (No interaction normally - exceptions can breach the
barrier.) It is the "blackboxness" of agents that makes it easy to
couple them together.
The other feature of agents is that they have stream input and output.
There is a direct analogy here with unix utilities cobbled together
with pipes. A difference (other than the fact that agents are not
processes) lies in the internal structure. A unix utility typically
has a read loop reading input from stdin; an agent, on the other hand,
has an event handler activated by input from an input port, e.g., stdin.
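
To make the contrast concrete, here is a minimal sketch in Python rather than San. The names upcase_utility, UpcaseAgent, on_input and run are invented for illustration; run is only a stand-in for whatever the runtime does to deliver input to a port.

```python
import io


def upcase_utility(stdin, stdout):
    # Unix style: the utility owns the loop and pulls its own input.
    for line in stdin:
        stdout.write(line.upper())


class UpcaseAgent:
    # Agent style: no read loop; the handler is called once per input unit.
    def __init__(self, emit):
        self.emit = emit  # supplied from outside; the agent never sees the destination

    def on_input(self, line):
        self.emit(line.upper())


def run(agent, source):
    # Hypothetical stand-in for the runtime delivering input on a port.
    for line in source:
        agent.on_input(line)


utility_out = io.StringIO()
upcase_utility(io.StringIO("foo\nbar\n"), utility_out)

agent_out = []
run(UpcaseAgent(agent_out.append), ["foo\n", "bar\n"])
```

Both produce the same stream; the difference is only in who holds the loop.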
The point of this digression is that messages violate blackboxness
because a message has a target and agents are not supposed to know
about other agents. On this conception I have to agree with you that
there shouldn't be messages.
The reason that I chose messaging (other than the elegance of message
based OOP) is that I supposed that agents would need to talk to each
other directly, i.e., agents would need to get information from other
agents. For example consider the following line from sample code:
msg to=scheduler, type=delay, delay=f-get, rsvp
The line of code sends a message to a scheduler saying that it (the
sending agent) wants its execution delayed by f-get time units. The
rsvp tag suspends agent execution until a reply is received.
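
One way to picture the suspend/resume behaviour of rsvp is with a Python generator. This is only an analogy, not how San does or would do it; the dictionary layout, the delay value of 5 and the reply text are all made up for the example.

```python
def delayed_agent(log):
    log.append("before delay")
    # Sending the message and suspending are one step: the agent yields
    # the request and stays frozen until a reply is sent back in.
    reply = yield {"to": "scheduler", "type": "delay", "delay": 5, "rsvp": True}
    log.append("resumed with " + reply)


log = []
running = delayed_agent(log)

# Run the agent up to the rsvp message; it is now suspended.
request = next(running)

# A hypothetical scheduler would act on the request here, then reply.
try:
    running.send("delay complete")
except StopIteration:
    pass  # the agent ran to completion after being resumed
```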
I am going to have to rethink messaging; I suspect I can dispense with
it at the price of more structure in the executive level. For
example, the code
if (alpha) msg to=foo, text
else msg to=bar, text
can be replaced with
if (alpha) emit-$out1 text
else emit-$out2 text
with the executive layer determining what out1 and out2 are connected
to. The executive layer then has connection code like
bageldorf 1|0 foo
bageldorf 2|0 bar
(I don't much like the syntax m|n but I haven't thought of a good
alternative at the moment.) I'm not sure what one does to replace the
rsvp scheme. What is involved is suspend/resume and blocking.
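
The emit-and-connect arrangement above can be sketched in Python as follows. Everything here (the Agent and Executive classes, the connect and deliver methods, the startswith test standing in for alpha) is an assumption made for illustration, not San syntax or semantics. In this sketch output to an unconnected port is silently dropped.

```python
class Agent:
    """A black box: it sees numbered ports, never other agents."""

    def __init__(self, name):
        self.name = name
        self.received = []
        self.emit = None  # installed by the executive

    def on_input(self, port, text):
        self.received.append((port, text))


class Bageldorf(Agent):
    def on_input(self, port, text):
        # Stand-in for "if (alpha)": picks an output port, not a target.
        out = 1 if text.startswith("a") else 2
        self.emit(out, text)


class Executive:
    def __init__(self):
        self.agents = {}
        self.pipes = {}  # (source, out port) -> list of (destination, in port)

    def add(self, agent):
        self.agents[agent.name] = agent
        agent.emit = lambda port, text, src=agent.name: self.deliver(src, port, text)

    def connect(self, src, out_port, in_port, dest):
        self.pipes.setdefault((src, out_port), []).append((dest, in_port))

    def deliver(self, src, port, text):
        for dest, in_port in self.pipes.get((src, port), []):
            self.agents[dest].on_input(in_port, text)


ex = Executive()
bageldorf, foo, bar = Bageldorf("bageldorf"), Agent("foo"), Agent("bar")
for agent in (bageldorf, foo, bar):
    ex.add(agent)
ex.connect("bageldorf", 1, 0, "foo")  # bageldorf 1|0 foo
ex.connect("bageldorf", 2, 0, "bar")  # bageldorf 2|0 bar

bageldorf.on_input(0, "alpha")
bageldorf.on_input(0, "beta")
```

All knowledge of foo and bar lives in the executive; the Bageldorf code mentions only port numbers.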
>Why not have an agent set up its own connections, or simply allow an agent
>to leave a port exposed for some other entity to set up the connection?
See above. Once you have agents setting up connections they
internalize program structure information.
>Perhaps you had something in mind that makes the distinction necessary.
>From this one reading, I can't see it.
>(I'm presuming here that any unconnected output ports are either a reported
>error, or the language would require all unconnected outputs to have a
>default "devnull" agent receptor to simply perform any tidy-ups.)
My thought here is that writing to an unconnected output port would
raise an exception; this would be a "program fault" exception rather
than a "failed assertion" exception. A "devnull" agent might be
>> where the port numbers are either integers or nil, that being a
>> shorthand for port 0. Pipes can be concatenated on a single
>> line. For example:
>Why integers? Is there something inherently numerical about the port
>system? I'd prefer some sort of labeled port, perhaps with a default label
>of (say) port.
I sympathize. The issue is that it is IMO desirable to have a
standard notation for ports that is uniform across agents and
executives. If I call my ports port0, port1, etc, and the guy in the
next office calls them foo, bar, etc, then we have a conflict of
naming conventions. If we are all going to use the same naming
convention then it is useful if the language specifies the convention.
It doesn't much matter what the convention is as long as it provides
I will think about it though.
>> of bagle. A port can be connected to more than one pipe by using
>> separate connection statements. For example a "tee" is effected
>> foo | bar
>> foo | bagle
>Could perhaps a tee work in the opposite direction? Perhaps someone making
>a filter agent which takes a single input but repeats items back out through
>one of two ports.
>scan_db | classify
> classify |gizmos gizmo_receptor
> classify |doodads doodad_receptor
>> Data is sent to pipes by "emit" commands. The emitted data can
>> either be in the form of a character stream, a stream of words
>> (strings), or a list of strings.
>I'd strongly suggest pairing up the string with some sort of identity.
>Without it, people will either come up with some "x=y" syntax, or use the
>list mechanism with some sort of informal specification defining which
>string means what.
>The ability to put a list of strings in a single package down a pipe is a
>good thing IMO. No need to perform any handshaking.
>I'd go further in allowing the list entity to contain another list, each
>with its own datum-id.
My thought is that this is headed in a bad direction. Linear streams are
simple and elegant. Trying to structure the streams with IDs,
variant records, attribute-value lists, etc, creates nasty
complications. The end result is that you need type definitions that
agents on both sides of a pipe have to access in order to be able to
write to the pipe and read from it. That leads to type definitions as
globals and then to include files and other such trash.
The alternative I am inclined to go with is a variant of Python's
serialization (pickle). The idea would be to package structured data in an object
and send the serialized object; the receptor would deserialize it to
recover the object.
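
For reference, the Python version of this round trip looks like the sketch below. The deque is just a stand-in for a pipe, and emit_object and receive_object are names invented for the example; only pickle.dumps and pickle.loads are the real library calls. (Python's documentation warns against unpickling data from untrusted sources, which a San variant would also have to consider.)

```python
import pickle
from collections import deque

pipe = deque()  # stand-in for a pipe; it carries only opaque byte strings


def emit_object(obj):
    # The sender flattens the structured object into a linear byte stream.
    pipe.append(pickle.dumps(obj))


def receive_object():
    # The receptor rebuilds an equal, independent object from the bytes.
    return pickle.loads(pipe.popleft())


record = {"kind": "gizmo", "parts": ["lever", ["nested", "list"]]}
emit_object(record)
received = receive_object()
```

The pipe itself stays a plain linear stream; neither side needs a shared type definition beyond agreeing that the stream holds serialized objects.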
>> In addition, blocking markers
>> can be inserted into the stream with "mark" commands. Emit
>> commands optionally contain a port number and a mode; the
>> defaults are port 0 and character stream mode. For example
>Why stop there? Why not allow different datums of different types to go
>down the channel. Even allow the programmer to define ADTs?
>> does not know where the data is coming from. Messages, on the
>> other hand, contain within them "from" and "to" fields. For
>> example, a message might be sent with code like:
>I would suggest you come up with a way for an agent to easily set up a
>temporary pipe, rather than introduce a new syntax for a similar concept.
Agents can't set up pipes, temporary or not. That said, I am
>> Q: Agents are analogous to processes in a *nix system; is it
>> feasible or desirable to reuse process management code from a
>> linux or free bsd kernel? Similarly, agents are analogous to Ada
>> tasks. Is it feasible or desirable to reuse (or use as a model)
>> Ada task management code?
>Are you concerned right now with defining the language or designing the
Both definition and implementation. Implementation concerns cast
shadows on definitions.
>> Q: By design San is thread-safe between agents since there is no
>> data sharing. Likewise procedures and functions are thread-safe.
>> Assignment within on units is nominally thread safe in that all
>> code statements are atomic at the code level. Is this enough?
>Unless each agent can have more than one thread (I'd oppose that idea) or
>agents cannot use any resource except via pipes/messages, I can't see how
>thread safety would be an issue.
I'm rather thinking that agents can't use any resource except via
pipes/messages, but that there can be more than one thread active in an
agent. Each event (pipe input, message, event, exception) is handled
by an "interrupt" handler, and an active handler thread can be
pre-empted by another handler thread.
>So long as your pipe/message mechanism is thread safe, but that's for the
>language implementor to worry about, not the san-programmer.
>> Q: San does not have any inter-agent globals nor does it have
>> globals within agents that extend across procedure/function
>Call me a bad person, but I think that an agent should be allowed to have
>"globals" within the agent itself, so long as you restrict one thread per
You're a bad person. :-)
My thought is that agents can have state and that the code within the
handlers can access that state. Procedures/functions defined within
the agent do not have access to those "globals".
>Without globals within an agent, programmers may end up organising opaque
>"state" datums, which must be passed back to the caller, requiring the
>caller to pass them in again the next time.
>> Q: I haven't quite decided on the agent creation/termination
>> rules. I could use a direct analog of fork and kill, but they
>> don't seem quite right. Are there any suggested alternatives?
>I'd go for a "spawn" mechanism. Allow the code to specify the type of agent
>(say "DatabaseProxy") and give it a name (say "mydatabase") and then it
>gets created from the template.
>As for forks, if an agent decides it needs a clone of itself, it could
>spawn an instance of its own type.
>On the other hand, a "clone" mechanism for stateful agents may be useful. I
Cloning is a language feature. You can say something like
set x = y
which makes x a clone of y, whatever y happens to be, including
>> Q: What other questions should I be asking?
>What tasks do you envisage your language being used for?
Almost anything that I happen to be interested in doing. :-)
That includes using it as a shell, as a scripting language, for
algorithm research, for doing some AI work, and a task to be
determined at a future date.
>> It is best to think of variables in San as being bound to
>> values rather than being pointers to values or addresses of
>Sounds reasonable. You may want to come up with a "handle" system with
>destructors, for things that can't be expressed as a single datum.
>(Connections to a database server perhaps. See the current thread about C#
I've been following the thread. The issues are a bit different in San
because resources are mostly managed under the hood.
>> San names can have hyphens in them.
>What does your subtraction syntax look like?
Infix arithmetic operators have to be surrounded by spaces.
(a - b) is a minus b. (a-b) is the variable named "a-b". It is valid
to have a variable named x+y*z. I know, it's bizarre. I would like
to have variables that are parsable as arithmetic expressions be bound
by default to said expressions. It turns out to be convenient.
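
The spacing rule is easy to state as a tokenizer. The following Python sketch is an assumption about how such a rule could be implemented, not San's actual lexer: parentheses always stand alone, everything else is split on whitespace only, and a piece counts as an operator only if it is exactly one operator character. Numbers are lumped in with names to keep it short.

```python
import re

OPERATORS = {"+", "-", "*", "/"}


def tokenize(source):
    tokens = []
    # Either a single parenthesis, or a run of characters that contains
    # neither whitespace nor parentheses.
    for piece in re.findall(r"[()]|[^\s()]+", source):
        if piece in ("(", ")"):
            tokens.append(("paren", piece))
        elif piece in OPERATORS:
            tokens.append(("op", piece))
        else:
            tokens.append(("name", piece))
    return tokens


spaced = tokenize("(a - b)")
unspaced = tokenize("(a-b)")
odd_name = tokenize("x+y*z")
```

Here spaced contains the operator token for minus between the names a and b, while unspaced and odd_name each contain a single name token, "a-b" and "x+y*z" respectively.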
>Bill, by the way, my own language, Aposta-C, (formerly known as POT) is
>still under development. Expect an RFCAS soon.
Can I use it as an implementation language?
Richard Harter, firstname.lastname@example.org
I used to do things that were good for me because they were fun.
Now I do them because they are good for me. Fun was better.