Re: ANNOUNCE: Colibri version 0.1



vitic wrote:
In the real world, what can this be used for? Are we missing any
functionality/power right now without it? Any examples? It is not
clear to me, although I did read the wiki posting.

Regarding memory management, the main real-world improvement would be the handling of cyclic structures. Tcl currently uses refcounting, which cannot handle cycles because objects that take part in a cycle never see their refcount drop to zero. At present there is no way at the script level to create such structures (that is, as first-class objects such as lists), so the current implementation is adequate from a pure Tcl point of view. However, the deep nature of Tcl from its very beginning has been a dual script/system (C API) approach. In other words, Tcl is as much a scripting language as it is an extension library. The problem is that there is no way to prevent the creation of such cycles at the system level. This means that an application can perfectly well use Tcl as a type library, building lists that reference themselves, and everything works fine as long as you don't generate string reps (=> infinite loop) or expect memory resources to be freed (=> leaks).

One could think that such cyclic structures are not that frequent in real-world apps, but this is wrong. For example, the Document Object Model defines a tree model where nodes maintain a reference to their parent; this has the side effect of creating cycles at every level of the dataset. Lacking this up-reference, nodes would need either external or contextual info to navigate up the tree hierarchy. The lack of cyclic structure handling is the reason why DOM trees cannot be implemented as plain Tcl objects (e.g. hierarchical lists) but need a custom script-level object wrapper or a lower-level approach (e.g. access by named references in an array or dict, which is somewhat similar to C pointers with all their drawbacks).

Python uses refcounting as well but needs a special cycle-detection pass to get rid of these cycles. So proper handling of cyclic structures needs a full-fledged garbage collector like the one that Colibri provides.
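
To make the refcounting limitation concrete, here is a minimal sketch in Python, assuming the CPython implementation (other Python implementations manage memory differently). The `Node` class and the `make_cycle` and `demo` helpers are hypothetical names made up for this illustration. With the cycle detector switched off, the two nodes stay alive after the last outside reference is gone; they are only reclaimed once the detection pass runs.

```python
import gc
import weakref


class Node:
    """A tiny container that can reference another Node."""
    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    """Build a two-node cycle and return weak references to both nodes."""
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a   # a -> b -> a: neither refcount can reach zero
    return weakref.ref(a), weakref.ref(b)


def demo():
    """Return (alive before the cycle pass, alive after the cycle pass)."""
    gc.disable()              # turn off the automatic cycle-detection pass
    try:
        ref_a, ref_b = make_cycle()
        # The local names are gone, but the cycle keeps both nodes alive.
        alive_before = ref_a() is not None and ref_b() is not None
        gc.collect()          # run the cycle detector explicitly
        alive_after = ref_a() is not None or ref_b() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

Under plain refcounting alone (the first half of `demo`), the cycle is effectively a leak, which is the situation described above for self-referencing Tcl lists built through the C API.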

But such low-level plumbing, while necessary, is not sufficient to properly handle cyclic data, as the script level enforces the contract that Everything Is A String (EIAS). Therefore there must be some way to generate the string representation of cyclic structures. Cloverfield proposes a serialization format at the syntactic level that allows that. I've also discovered that Common Lisp provides the same kind of solution that Cloverfield proposes, but as a write-only format, through the use of the *print-circle* variable. So I'm confident that this can be done.
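
As a rough illustration of how a cyclic structure can get a finite string rep, here is a sketch in Python that labels any list reached more than once and prints a back-reference instead of recursing forever. The `#n=` / `#n#` notation is borrowed from what Common Lisp prints when *print-circle* is enabled; it is not Cloverfield's actual syntax, and the `serialize` function is a hypothetical helper written for this example.

```python
def serialize(value):
    """Render nested lists as text, labelling lists that are reached twice."""
    # Pass 1: find every list that is reachable by more than one path
    # (this includes lists that contain themselves).
    seen, shared = set(), set()

    def scan(v):
        if not isinstance(v, list):
            return
        if id(v) in seen:
            shared.add(id(v))
            return
        seen.add(id(v))
        for item in v:
            scan(item)

    scan(value)

    # Pass 2: emit text. A shared list is printed once with a "#n=" label;
    # every later occurrence becomes the back-reference "#n#".
    labels = {}

    def emit(v):
        if not isinstance(v, list):
            return str(v)
        key = id(v)
        if key in labels:
            return f"#{labels[key]}#"
        prefix = ""
        if key in shared:
            labels[key] = len(labels) + 1
            prefix = f"#{labels[key]}="
        return prefix + "(" + " ".join(emit(item) for item in v) + ")"

    return emit(value)


cyclic = ["a", "b"]
cyclic.append(cyclic)          # the list now contains itself
print(serialize(cyclic))       # prints "#1=(a b #1#)"
```

A naive recursive printer would loop forever on `cyclic`; the labelling pass is what keeps the output finite.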

Of course Colibri can be used as is to implement automatic memory management in any application, and I'm confident that it would be reasonably fast. It is in no way tied to Tcl or dependent on its semantics (e.g. EIAS).


Regarding the other aspects of Colibri, namely ropes, I think that they are a great alternative to flat strings and are much more practical when handling large data. Moreover, they allow procedural or lazily computed data, or memory-mapped files that far exceed the available memory, none of which can be done with standard strings. So Colibri can be used as a string management library in any application as well.
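
For readers unfamiliar with ropes, here is a bare-bones sketch in Python of the general idea: a binary tree whose leaves hold text chunks, so that concatenation creates one small node instead of copying both strings. This is not Colibri's API or its internal layout; `Leaf`, `Concat`, and `flatten` are hypothetical names, and a real implementation would also rebalance the tree and support substring nodes.

```python
class Leaf:
    """A flat chunk of text."""
    def __init__(self, text):
        self.text = text
        self.length = len(text)

    def char_at(self, i):
        return self.text[i]

    def chunks(self):
        yield self.text


class Concat:
    """An interior node joining two ropes without copying either one."""
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.length = left.length + right.length

    def char_at(self, i):
        # Descend into whichever side holds index i.
        if i < self.left.length:
            return self.left.char_at(i)
        return self.right.char_at(i - self.left.length)

    def chunks(self):
        yield from self.left.chunks()
        yield from self.right.chunks()


def flatten(rope):
    """Copy the whole rope into one flat string (only when really needed)."""
    return "".join(rope.chunks())


rope = Concat(Leaf("Hello, "), Concat(Leaf("wor"), Leaf("ld")))
print(rope.length, rope.char_at(7), flatten(rope))
```

Because a leaf only has to answer `char_at` and `chunks`, it could just as well compute its content on demand or read it from a mapped file, which is where the lazily computed data mentioned above comes in.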