Re: Python memory handling



Hello,

frederic.pica@xxxxxxxxx wrote:
I've some troubles getting my memory freed by python, how can I force
it to release the memory ?
I've tried del and gc.collect() with no success.

[...]

The same problem here with a simple file.readlines()
#Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
del data #meminfo: 11.5 Mb private, 1.4 Mb shared
gc.collect() # no memory change

But works great with file.read() :
#Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.read() #meminfo: 7.3Mb private, 1.4 Mb shared
del data #meminfo: 1.1 Mb private, 1.4 Mb shared
gc.collect() # no memory change

So as I can see, python maintain a memory pool for lists.
In my first example, if I reparse the xml file, the memory doesn't
grow very much (0.1 Mb precisely)
So I think I'm right with the memory pool.

But is there a way to force python to release this memory ?!

This is from the 2.5 series release notes (http://www.python.org/download/releases/2.5.1/NEWS.txt):

"[...]

- Patch #1123430: Python's small-object allocator now returns an arena to
the system ``free()`` when all memory within an arena becomes unused
again. Prior to Python 2.5, arenas (256KB chunks of memory) were never
freed. Some applications will see a drop in virtual memory size now,
especially long-running applications that, from time to time, temporarily
use a large number of small objects. Note that when Python returns an
arena to the platform C's ``free()``, there's no guarantee that the
platform C library will in turn return that memory to the operating system.
The effect of the patch is to stop making that impossible, and in tests it
appears to be effective at least on Microsoft C and gcc-based systems.
Thanks to Evan Jones for hard work and patience.

[...]"

So with 2.4 under linux (as you tested) you will indeed not always get the used memory back, with respect to lots of small objects being collected.

The difference therefore (I think) you see between doing an f.read() and an f.readlines() is that the former reads in the whole file as one large string object (i.e. not a small object), while the latter returns a list of lines where each line is a python object.

I wonder how 2.5 would work out on linux in this situation for you.

Paul
.