TIP #192: LAZY LISTS
From: Salvatore Sanfilippo (antirez_at_nospam.invece.org)
Date: 05/14/04
- Next message: Peter Wang: "regexp with special character like $ in string"
- Previous message: lihong pei: "Any plan for win64 support in Tcl??"
- Next in thread: Andreas Leitgeb: "Re: TIP #192: LAZY LISTS"
- Reply: Andreas Leitgeb: "Re: TIP #192: LAZY LISTS"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Thu, 13 May 2004 22:15:02 GMT
This TIP is not so new, but there was some problem (probably
with c.l.t.a) so it never reached this group. So I decided to post
it directly.
Regards,
Salvatore Sanfilippo
TIP #192: LAZY LISTS
======================
Version: $Revision: 1.2 $
Author: Salvatore Sanfilippo <antirez_at_invece.org>
Theo Verelst <theover_at_tiscali.nl>
State: Draft
Type: Project
Tcl-Version: 9.0
Vote: Pending
Created: Saturday, 27 March 2004
URL: http://www.tcl.tk/cgi-bin/tct/tip/192.html
Post-History:
-------------------------------------------------------------------------
ABSTRACT
==========
This TIP proposes to add a new command to generate lists of /N/
elements, where the /i/-th element is computed as the result of an
unary Tcl procedure with /i/ as itsargument. Implementing special
handling for this kind of lists inside the Tcl core will allow
generation of lists in a /lazy/ way. This TIP's goal is not to change
the semantics of Tcl, but just to provide a different space complexity
for an (often interesting) subset of Tcl lists.
RATIONALE
===========
A subset of Tcl lists can be generated mapping an unary function to an
integer sequence in the range [0,n), where /n/ is the length of the
list. The following procedure implements this concept:
proc lgen {len func} {
set l {}
for {set i 0} {$i < $len} {incr i} {
lappend l [uplevel 1 [list $func $i]]
}
return $l
}
and the following (using the *lambda* from [TIP #187]) is an example of
usage:
proc lambda {argl body} {
set name [info level 0]
proc $name $argl $body
set name
}
set mylist [lgen 10 [lambda x {incr x; expr $x*$x}]]
The above code will evaluate to the same as [list 1 4 9 16 25 36 49 64
81 100]. *lgen* can be used in order to build a particularly useful Tcl
procedure named *range*, returning a sequence of integers with given
/start/ /end/ and (optionally) /step/ paramenters. The *range* function
is convenient for rewriting many *for* loops in terms of *foreach*
iterating over an integer range. So instead of writing:
for {set i 0} {$i < 20} {incr i 2} {
puts $i
}
It is possible to write:
foreach i [range 0 20 2] {
puts $i
}
That is more convenient to write and to read for the programmer. Of
course it's possible to use *foreach* to iterate any sequence that
*lgen* is able to generate, but *range* is probably one of the more
common of the possible usages.
The TIP proposes to implement the ability to handle this kind of lists
in a special way directly inside the /List/ object implementation. This
allows these common usage patterns of the list object to not need to
hold the real sequence but just the length and the unary element
generator function.
The interface to the Tcl programmer is a single command similar in
semantics to *lgen*, but possibly with a more suitable name.
PROPOSED CHANGE
=================
The /List/ object should be modified in order to have the lazy-list as
subtype or alternate internal implementation. All the core should use
the proper /List/ object API instead to access to the /List/ object via
the internal representation.
The two calls that should be guaranteed to not alter the /lazyness/ of
the list are *Tcl_ListObjLength()* and *Tcl_ListObjIndex()*. Other
calls like *Tcl_ListObjReplace()* may be optimized for the lazy-list
case when possible, and the *lrange* command may be optimized
(particularly when the start index is zero).
The /List/ will be converted into a non-lazy version if the user tries
to modify it, for example using the *Tcl_ListObjAppendElement()*
function. The /List/ will also be converted into a non-lazy version on
*Tcl_ListObjGetElements()* calls.
It's possible to handle the *range* command as a particular case of
lazy-list in order to provide a very fast implementation of foreach
iterating over an integer range (probably much faster than the today
*for*, being *foreach* already faster iterating over a literal list of
numbers).
CONSEQUENCES: REFERENCE MANAGEMENT
------------------------------------
The author of this TIP tried to implement the proposed changes in the
HEAD, discovering that the *Tcl_ListObjIndex()* interface creates a
serious problem due to the assumption that the /List/ object holds at
least one reference to the returned element. The implementation of
*Tcl_ListObjIndex()* in the lazy case can't just create the element
object and return it with refcount of zero because it will leak if the
caller does not increment the reference count itself.
It's also not safe to store a reference to the last few elements
created in the lazy way inside the /List/ object, and release this
references in order to create more elements, because in theory the
caller may require a large number of elements storing pointers into an
array, and finally incrementing the reference counts in a single pass.
In order to avoid this problem, the semantics of *Tcl_ListObjIndex()*
should be changed in order to always return the element with an already
incremented reference count. It will be up to the caller to decrement
the reference count if the object will be discarded. (This is why this
change is proposed for Tcl 9.0 and not 8.5, as this has a significant
impact on both the core and on extensions.)
An alternative change to *Tcl_ListObjIndex()* (in order to make it
"compatible" with the semantics of lazy lists) is to disallow
successive calls against the same list if a previous call returned an
object that the caller plans to reference the object.
So:
Tcl_ListObjIndex(interp, myListPtr, 0, &a);
Tcl_ListObjIndex(interp, myListPtr, 0, &b);
mystruct->a = a;
mystruct->b = b;
Tcl_IncrRefCount(a);
Tcl_IncrRefCount(b);
will be invalid, while:
Tcl_ListObjIndex(interp, myListPtr, 0, &a);
Tcl_IncrRefCount(a);
mystruct->a = a;
Tcl_ListObjIndex(interp, myListPtr, 0, &b);
Tcl_IncrRefCount(b);
mystruct->b = b;
is valid. This fixes any problem because the /List/ object can just
take a reference to the last generated object and avoid any leak. A
study of existing code in the core and extensions is required to see
whether this will allow the majority of code to operate unchanged.
CONSEQUENCES: NON-CONSTANT LISTS
----------------------------------
The TIP also poses another problem in the case the unary function has
side effects. In this case the behaviour can be described without to
violate the Tcl semantic in terms of variable traces most of the time,
but actually it's possible to write code that shows that the list value
is non-stable even without using variables (because Tcl has no "object
trace" concept actually).
If this is considered a problem, the TIP may be reduced in order to
only allow lazy generated lists composed of integer ranges, (but that
is one of the most interesting advantages of this TIP anyway.) With
integer ranges there are no side effects, so the semantical problem is
not an issue, but the *Tcl_ListObjIndex()* problem is exactly the same.
Of course, this is arguably just an unavoidable consequence and shows
that not all possible unary element generator functions, but just those
with a functional denotation, are necessarily reasonable choices.
COPYRIGHT
===========
This document has been placed in the public domain.
-------------------------------------------------------------------------
TIP AutoGenerator - written by Donal K. Fellows
--
Salvatore Sanfilippo <antirez at invece dot org>
"Minimalism, simplicity and consistency are better guides (...) We suspect
that many of today's object-oriented languages could profit by dropping
features" -- Randall B. Smith and David Ungar, Programming as an Experience.
- Next message: Peter Wang: "regexp with special character like $ in string"
- Previous message: lihong pei: "Any plan for win64 support in Tcl??"
- Next in thread: Andreas Leitgeb: "Re: TIP #192: LAZY LISTS"
- Reply: Andreas Leitgeb: "Re: TIP #192: LAZY LISTS"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|