Re: Efficient use of results of [binary scan]



On Mon, 11 Apr 2005 13:31:48 +0100, "Donal K. Fellows"
<donal.k.fellows@xxxxxxxxxxxxxxxx> wrote:

[...]
>> Works fine, but it's a bit slow. [binary scan] is
>> impressively fast, but the [foreach] is not - it
>> takes about 80% of the time, on the 100K string
>> I tried. [binary scan] uses less than 20%.

>Let me start by explaining why [binary scan] itself is fast. The trick
>is that it shares the objects representing the values scanned (and
>written into the list in $signed_bytes) so that memory allocation is
>minimized. And memory allocation is *definitely* comparatively slow
>(nothing that Tcl does would detract from that; it doesn't preallocate
>several megabytes of space for Tcl_Obj instances). The same trick is
>applied in [scan $data ""], which is actually where it was developed first.

Fascinating stuff.

>Given that, the source of the slowness is almost certainly the fact that
>the output of the [expr] in that loop is not (and cannot be) shared
>automatically (without really significant effort). We end up allocating
>100K objects, and that hurts (minimum 2.4MB memory allocated and
>initialized). If that is the source of the problem, the answer is to
>find a way of sharing the output of the calculation. So try this:
>
> proc unsigned_bytes raw {
> set cvt {}
> for {set i 0} {$i<256} {incr i} {lappend cvt $i}
> binary scan $raw c* signed
> foreach byte $signed {
> lappend unsigned [lindex $cvt [expr {$byte & 255}]]
> }
> return $unsigned
> }

That solution would never have occurred to me! However, the
bad news is that it is almost exactly the same speed as my
naif solution, for my 100K string. I'll put a little effort
into trying various options and various data sizes, to try
to find what's hurting most.

In the end I suspect I'll put a bit more effort into making
it possible to use the signed output from [binary scan]
instead of coercing it all to unsigned before working on it.

Despite all that, though, wouldn't it be a good idea to
offer unsigned conversion options on [binary scan]? I can't
believe I'm the only person who needs it.

Thanks again
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL, Verilog, SystemC, Perl, Tcl/Tk, Verification, Project Services

Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, BH24 1AW, UK
Tel: +44 (0)1425 471223 mail:jonathan.bromley@xxxxxxxxxx
Fax: +44 (0)1425 471573 Web: http://www.doulos.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
.



Relevant Pages

  • Re: Large strings and out of memory
    ... > should do the trick, ... The prob is that the first memory ... actually the entire string contains important data ... >> So it seems like Delphi creates a whole new memory allocation containing ...
    (comp.lang.pascal.delphi.misc)
  • Re: The Strinx Library
    ... The Strinx philosophy is that a library should not enforce a ... >> technique. ... >> string. ... >> an extra memory allocation and copy. ...
    (comp.lang.cpp)
  • A neat trick to serialize arrays and hashes
    ... who thinks that my trick is "obvious to everyone but inexperienced ... hashes -- that is, it can pack and unpack arrays and hashes to and ... I might want to serialize @a into a string for the purpose of storing ... array into a string like so: ...
    (comp.lang.perl.misc)
  • Re: HLA Lib
    ... then have ten times that amount wasted by a memory allocation call. ... string management information into only nine bytes. ... Even on 32-bit systems, virtual memory size doesn't matter much, and as ... but assembly language programmers should learn to pay attention to ...
    (alt.lang.asm)
  • Re: Problem using copy.copy with my own class
    ... have a string associated with the int. ... By subclassing int to add a string, ... How do I know when the memory allocation ... between pickling and copying here. ...
    (comp.lang.python)