Re: loading csv-files: too much memory-consumption



Rainer Joswig <joswig@xxxxxxx> writes:

> I still have no idea how likely it is that the stack in ECL will
> contain an assumed pointer to heap data. Is that something that will
> happen very rarely or only in special situations? What kind
> of data on the stack would trigger it? (assuming that heap
> addresses begin at some point and end, ...).

First, stack addresses are a limited range (or ranges) of integers.

On a linux system try on your (sh-like) shell:

cat /proc/$$/maps

to see a typical memory map, or try:

(com.informatimago.common-lisp.interactive:cat
 (format nil "/proc/~D/maps"
         #+clisp (posix:process-id)
         #-clisp
         (progn
           (cerror "return your pid please" "How do we get the pid here?")
           (format *query-io* "Pid? ")
           (read *query-io*))))

to see the memory map of your lisp implementation.


Now, if only initialized data were stored on the stack (and in the live
parts of the heap), then the probability of taking some bits for a
pointer into the heap would in general be proportional to the size of
the heap relative to the size of the address space.

However, in non-controlled programming languages (or "optimization"
levels), not all data is initialized. In that case it is rather more
probable to find a true pointer in random places, because the legal
addresses are already stored in memory, and are therefore subject to
copying and reuse in non-initialized variables.

malloc() allocates size bytes and returns a pointer to the
allocated memory. The memory is not cleared.

Therefore, if a function written in an uncontrolled programming language
uses this kind of non-initializing memory allocation, the probability
that the memory block contains an address left behind by a freed
structure that held that pointer is rather higher.
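
The reuse effect can be sketched in C. This is only an illustration under
stated assumptions: slot_alloc and slot_free are a hypothetical one-slot
allocator that, like malloc(), never clears what it hands out, and
count_possible_pointers is a hypothetical stand-in for the scan a
conservative collector performs. None of this is ECL code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical one-slot "allocator": it hands out the same buffer every
   time and, like malloc(), never clears it. */
#define SLOT_WORDS 4
static uintptr_t slot[SLOT_WORDS];

uintptr_t *slot_alloc(void)
{
    return slot;                       /* contents are NOT cleared */
}

void slot_free(uintptr_t *p)
{
    assert(p == slot);                 /* old contents stay in place */
    (void)p;
}

/* Count the words whose value lies in [lo, hi), the way a conservative
   collector would look for possible pointers into the heap. */
size_t count_possible_pointers(const uintptr_t *words, size_t n,
                               uintptr_t lo, uintptr_t hi)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (words[i] >= lo && words[i] < hi)
            hits++;
    return hits;
}

/* Store an address in a block, free the block, get it back without
   initializing it, and scan it: the stale address is still there. */
size_t stale_pointer_demo(void)
{
    static char heap_object[64];       /* stands in for a heap object */
    uintptr_t lo = (uintptr_t)heap_object;
    uintptr_t hi = lo + sizeof heap_object;

    memset(slot, 0, sizeof slot);      /* known starting state for the demo */

    uintptr_t *a = slot_alloc();
    a[2] = lo;                         /* first user stores a pointer */
    slot_free(a);

    uintptr_t *b = slot_alloc();       /* second user initializes nothing */
    return count_possible_pointers(b, SLOT_WORDS, lo, hi);
}
```

Calling stale_pointer_demo() reports one possible pointer in the
"fresh" block, although its second user never stored one.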



So, basically, if you use half of your address space for heap, the
probability that a random word points in it is p=1/2. But if you use
C, that probability becomes much bigger.

Now, if you restrict the valid pointers to the first byte of each
allocated memory block, and assume an average of 8 or 16 bytes per
memory block, you obviously reduce the default probability to 1/16 or
1/32, but this doesn't much change the higher probability of having
unerased valid pointers with unmanaged programming languages.
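
As a worked check of these figures, here is a small C sketch. It assumes
uniformly random words, which is exactly the assumption that
uninitialized memory violates; the function name and its parameters are
made up for the illustration.

```c
#include <assert.h>

/* Probability that a uniformly random word is taken for a heap pointer.
   avg_block_bytes = 1 means any address inside the heap counts;
   a larger value means only the first byte of each block counts. */
double false_pointer_probability(double heap_bytes,
                                 double address_space_bytes,
                                 double avg_block_bytes)
{
    assert(address_space_bytes > 0);
    assert(heap_bytes >= 0 && heap_bytes <= address_space_bytes);
    assert(avg_block_bytes >= 1);
    /* fraction of the address space covered by the heap, divided by
       the number of addresses per accepted block start */
    return (heap_bytes / address_space_bytes) / avg_block_bytes;
}
```

With a 2 GiB heap in a 4 GiB address space, this gives 1/2 when any
heap address counts, 1/16 with 8-byte blocks, and 1/32 with 16-byte
blocks.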



> If some data on the stack is interpreted as a pointer and the
> pointer would point to the middle of a large array on the
> heap. Would that object then be recognized as a piece of a large
> array, or would it cause marking of again non-precisely recognized
> data?

It has been mentioned for ECL that only the address of the first byte
of each memory block is taken as a live pointer. This is somewhat
non-conservative, but we can impose this rule on C programs that
embed ECL: do not keep offset pointers to lisp data, or pointers in
packed structures, etc.
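
The difference between the two rules can be sketched in C as follows.
This is a hypothetical model, not the collector ECL actually uses:
struct block and both functions are invented for the illustration, and a
real collector would not do a linear search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one allocated block. */
struct block {
    uintptr_t start;   /* address of the first byte */
    size_t    size;    /* size in bytes */
};

/* Rule described above: a word keeps a block alive only if it is
   exactly the address of the block's first byte. */
int marks_block_start_only(uintptr_t word,
                           const struct block *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (word == blocks[i].start)
            return 1;
    return 0;
}

/* Fully conservative alternative: any address inside a block keeps
   that block alive. */
int marks_interior_too(uintptr_t word,
                       const struct block *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (word >= blocks[i].start &&
            word - blocks[i].start < blocks[i].size)
            return 1;
    return 0;
}
```

Under the first rule, a pointer into the middle of a large array does
not retain the array; under the second it does. That is why a C program
must not keep only offset pointers when the first rule is in force.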

By the same token, if you have a low-level primitive to convert a
reference to a lisp object into an integer containing its address,
you cannot count on the garbage collector to retain that lisp object
if no other reference remains.


--
__Pascal Bourguignon__