Re: TclX loop slow in the default case



Alexandre Ferrieux wrote:
On Aug 29, 5:48 pm, Alexandre Ferrieux <alexandre.ferri...@xxxxxxxxx>
wrote:
On Aug 29, 12:42 pm, miguel <mso...@xxxxxxxxxxxx> wrote:

Alexandre Ferrieux wrote:
Random idea: what about extending the bytecode compiler API so that it
can be passed a non-empty "context" (compiledLocals, possibly
literals, etc) instead of starting afresh ? This way, in the dominant
case where the upper context is also a compiled proc, the body-
bytecode would "land" among the upper locals and access them at full
speed (not even a linkvar). Or is such sharing of lists of compiled
locals impossible in the current setup ?
Done in 8.6 HEAD since 2008-06-08 [Patch 1973096]
Guys, what's nice with Miguel is that not only does he make my dreams
come true, but sometimes he does it in advance just to be sure 8^P

Update: while the above two lines certainly hold regardless, I'm
disappointed to report that in HEAD (8.6a3), the uplevelled-for still
crawls at hardly more than half the speed of the naive for-uplevel:

$ ./tclsh86.exe
% proc tfor {} {for {set x 0} {$x<10000} {incr x} {}}
% proc tup {} {uploop x 0 10000 {}}
% proc uploop {var from to body} {
upvar 1 $var v
for {set v $from} {$v <= $to} {incr v} {
uplevel 1 $body
}
}
% proc loop {var from to body} {
uplevel 1 [list for [list set $var $from] "\$$var<$to" [list incr $var] $body]
}
% proc tloop {} {set x 0;incr x;loop x 0 10000 {}}

% time tfor
751 microseconds per iteration
% time tup
5731 microseconds per iteration
% time tloop
10593 microseconds per iteration

Any idea why ? (notice that in tloop I took steps so that the upper
context has a compiled local for the iteration variable).

I can talk about this ... you decide if any of this makes sense (and time it if you want).

(1) tup is calling [uplevel 1] once per iteration, which implies a bit of activity; tloop is calling it just once: ADVANTAGE tloop

(2) tup and tloop are both calling TEBC (TclExecuteByteCode) to run $body once per iteration: it's a tie

(3) tup is accessing $v by index; tloop by name: ADVANTAGE tup

And the winner is one of (or should be) ...

proc cheatloop {var from to body} {
set nbody "for {set $var $from} {\$$var < $to} {incr $var} {$body}"
uplevel 1 $nbody
}
proc cloop {} {cheatloop x 0 10000 {}}

(Note that this is very similar to your tloop, but carefully bypassing the list optimisation - we do want the [for] to be bytecompiled, not just $body)
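To make the list-versus-string point concrete, here is a minimal sketch that builds the same [for] both ways and times them. The names listloop, strloop, tlist and tstr are made up for this illustration; the timings depend on the Tcl version and build, so none are quoted here. Note that the string form simply splices $body between braces, so it assumes the body has balanced braces.

```tcl
# Hypothetical helper names, used only for this illustration.

# Pure-list form: [uplevel] receives a single list argument and invokes
# [for] directly as a command, so the [for] itself is not bytecompiled
# (its test, next and body scripts are still evaluated individually).
proc listloop {var from to body} {
    uplevel 1 [list for [list set $var $from] "\$$var < $to" [list incr $var] $body]
}

# String form: [uplevel] receives a script string, so the whole [for],
# including its test and increment, can be compiled as one script.
proc strloop {var from to body} {
    uplevel 1 "for {set $var $from} {\$$var < $to} {incr $var} {$body}"
}

# Callers that already have x as a local before the loop runs.
proc tlist {} {set x 0; listloop x 0 10000 {}; return $x}
proc tstr  {} {set x 0; strloop  x 0 10000 {}; return $x}

puts "list form:   [time tlist 100]"
puts "string form: [time tstr 100]"
```

Both callers end with x at 10000, so the two forms do the same work; only the way the [for] gets evaluated differs.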

Another contender for the championship would be

proc cheatloop2 {var from to body} {
set nbody "set $var $from\nwhile 1 {$body\n if {\[incr $var\] >= $to} break}"
uplevel 1 $nbody
}

Yes! Single [uplevel] call, the evaluation of $body's bytecodes occurs without exiting and re-entering TEBC, and the loop test accesses the var by index!
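One caveat before timing these against each other: the loop styles do not all run the body the same number of times. The sketch below counts body executions for uploop (copied from earlier in the thread) and for a while-based variant; whileloop and count are hypothetical names for this illustration, and whileloop increments the caller's variable through $var.

```tcl
# uploop as posted earlier in the thread: inclusive upper bound (<=).
proc uploop {var from to body} {
    upvar 1 $var v
    for {set v $from} {$v <= $to} {incr v} {
        uplevel 1 $body
    }
}

# Hypothetical while-based variant: the body runs first, then the
# caller's variable is incremented and tested, so the upper bound is
# exclusive and the body always runs at least once.
proc whileloop {var from to body} {
    set nbody "set $var $from\nwhile 1 {$body\n if {\[incr $var\] >= $to} break}"
    uplevel 1 $nbody
}

# Hypothetical helper: runs the given loop command with a body that
# bumps a counter in this proc, and returns how often the body ran.
proc count {loopcmd from to} {
    set n 0
    $loopcmd x $from $to {incr n}
    return $n
}

puts "uploop    0..10: [count uploop 0 10]"
puts "whileloop 0..10: [count whileloop 0 10]"
puts "uploop    5..0:  [count uploop 5 0]"
puts "whileloop 5..0:  [count whileloop 5 0]"
```

This prints 11, 10, 0 and 1 respectively: uploop includes the upper bound, while the while-based form excludes it and still runs the body once on an empty range. For an empty body over 10000 iterations the difference is negligible, but it matters if the variants are meant to be interchangeable.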

What I seem to be missing is the point of this exercise ...