Re: GC performance - GC fragility
- From: Eric Grange <egrangeNO@xxxxxxxxxxxxxxx>
- Date: Tue, 29 Jan 2008 10:19:38 +0100
The point you are missing is that it's still a linear address space. Or isn't the 2 GB address space each process has linear ?
Isn't the whole process and the system using virtual memory ?
Isn't GC >not using< virtual memory ?
GC is allocating memory linearly, as if virtual memory space "holes" were evil, but the point is that virtual memory "holes" only matter when you allocate large blocks (i.e. blocks larger than the granularity used when requesting memory from the OS, which is usually a few hundred KB to several MB depending on the allocator).
When you allocate such large blocks, regardless of your strategy you're going to hit a wall well below 2GB, because the virtual address space already comes with locked blocks when your app begins.
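To put rough numbers on that wall, here is a minimal sketch. The layout below is entirely made up for illustration (the addresses and sizes of the image, stack and DLLs are assumptions, not measurements from a real process). It only shows that the largest single block you can get is bounded by the biggest gap between locked ranges, not by the total free space.

```python
GB = 1 << 30
MB = 1 << 20

def largest_free_range(space_size, reserved):
    """Return the size of the largest contiguous gap not covered by `reserved`.

    `reserved` is a list of (start, size) ranges that are already taken.
    """
    largest = 0
    cursor = 0  # end of the highest reserved range seen so far
    for start, size in sorted(reserved):
        largest = max(largest, start - cursor)
        cursor = max(cursor, start + size)
    return max(largest, space_size - cursor)

# Hypothetical layout of a fresh 2GB user address space (assumed values).
reserved = [
    (4 * MB, 8 * MB),        # executable image
    (256 * MB, 1 * MB),      # a thread stack
    (1024 * MB, 16 * MB),    # a DLL loaded in the middle of the space
    (1900 * MB, 100 * MB),   # system DLLs near the top
]

total_free = 2 * GB - sum(size for _, size in reserved)
biggest_block = largest_free_range(2 * GB, reserved)

print("total free  :", total_free // MB, "MB")
print("biggest hole:", biggest_block // MB, "MB")
```

With this assumed layout almost all of the 2GB is free, yet no single allocation can be larger than the biggest hole, whatever the allocator or GC on top does.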
In 64bit, this becomes a complete non-issue for an allocator like FastMM, and GC compaction becomes an artefact of the GC approach. Virtual space is large enough that fragmentation of actually allocated memory is a non-issue, but the GC still has to compact because its allocation strategy litters stuff all across the memory pool: if the GC doesn't compact, it'll run out of physical memory.
The only way I know to (somewhat) directly address and map physical memory under Windows x86 is AWE (Address Windowing Extensions).
If you're hitting the 2GB barrier because of large blocks, you can consider yourself toast whatever the methodology (GC or FastMM, though in practice you'll have more trouble with a GC due to the higher memory waste of GC allocators).
If you're hitting the 2GB barrier because of small blocks, a strategy like FastMM's means fragmentation isn't an issue.
Virtual memory only means that physical memory is mapped to a virtual memory address. But mapping >used< memory to another address location would cause the application some trouble, since it's using linear addresses ? Wouldn't it ?
Not if it happens within a realloc.
--> GC also uses virtual memory functions to allocate memory.
But it uses them to allocate a linear heap.
--> Linearity of GC memory is too only virtual
Why "too"? FastMM isn't linear.
--> If not GC could only allocate memory in a linear free block
--> which would be quite hard to find
That's what ancient allocators did (like the old RTL one), but that's not a good strategy in a virtual memory world (both from a memory usage point and from a performance POV).
FastMM uses VirtualAlloc which returns a block of memory -> so what ?
If these blocks are somehow distributed in the linear address space it has no effect on the fragmentation of the virtually linear heap.
FastMM isn't linear. The ancient RTL MM was linear.
Linear allocation strategy doesn't bring any benefits in a virtual memory world.
It's simply a linked list of memory blocks of a specified granularity in a linear address space.
No it's not. There are multiple sets of blocks, organized more like stacks than like a linked list. There is no linked list structure as in your example, where you have a free/allocated/free/allocated chain; that was in ancient heap managers.
If you have 1 byte of application heap allocated in each of 1000 virtual allocate blocks of 64K they still would be allocated and couldn't be freed -> heap fragmentation.
Why would you have 1 byte in a 64k block in the first place? How would you get there with FastMM?
The only way to get there is with a linear allocation strategy.
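A toy model makes the difference concrete. This is only a sketch under loud assumptions: `BumpHeap` stands in for a linear strategy that never reuses holes, `PoolHeap` for a fixed-size pool that reuses freed slots first. Neither is FastMM's or any GC's real implementation, and the workload (one long-lived 16-byte object followed by a burst of temporaries per round) is invented.

```python
CHUNK = 64 * 1024                  # one block requested from the OS
SLOT = 16                          # every object in this toy workload is 16 bytes
SLOTS_PER_CHUNK = CHUNK // SLOT

class BumpHeap:
    """Linear heap: always allocates at the top and never reuses holes."""
    def __init__(self):
        self.top = 0
        self.live = set()

    def alloc(self):
        addr = self.top
        self.top += SLOT
        self.live.add(addr)
        return addr

    def free(self, addr):
        self.live.remove(addr)

    def chunks_in_use(self):
        # A chunk cannot be released while any live object sits in it.
        return len({addr // CHUNK for addr in self.live})

class PoolHeap(BumpHeap):
    """Fixed-size pool: freed slots go on a stack and are reused first."""
    def __init__(self):
        super().__init__()
        self.free_slots = []

    def alloc(self):
        if self.free_slots:
            addr = self.free_slots.pop()
            self.live.add(addr)
            return addr
        return super().alloc()

    def free(self, addr):
        super().free(addr)
        self.free_slots.append(addr)

def run(heap, rounds=100):
    """Each round: keep one object, then allocate and free a burst of temporaries."""
    kept = []
    for _ in range(rounds):
        kept.append(heap.alloc())
        temps = [heap.alloc() for _ in range(SLOTS_PER_CHUNK - 1)]
        for addr in temps:
            heap.free(addr)
    return heap.chunks_in_use()

print("linear heap chunks pinned:", run(BumpHeap()))
print("pooled heap chunks pinned:", run(PoolHeap()))
```

In this model the linear heap ends up with one surviving object per 64K chunk, which is exactly the pathological case described, while the pool packs the survivors into a couple of chunks because the temporaries keep recycling the same slots.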
But this is also true for GC. You can always find examples where
the one or the other heap wins.
Well that's where I beg to strongly differ, we have yet to see one example where the GC wins by a fair margin, while there are lots and lots of cases where it was found to lose by an order of magnitude.
When managed applications are in practice slower than native ones it's not >only< the GC's fault.
No it's not, but when the fault is that of the GC, you're toast. Other sources of slowdown can be worked around (though some will take an inordinate amount of work), but if you're hitting the GC barrier, then that's bad as you'll have to give up on core libraries and facilities (since they all rely on said GC).
So why does FastMM distinguish between different allocation sizes - IIRC 2 or 3 -> because of fragmentation ?
To avoid the fragmentation of a linear heap yes, and also for cache efficiency purposes (you may want to have a look at the google archives, there were some very interesting performance situations uncovered).
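As a rough illustration of what distinguishing allocation sizes means, here is a minimal sketch of routing a request to a size class. The thresholds and the 16-byte rounding step are assumptions picked for the example, not FastMM's actual values, and `classify` is a hypothetical helper name.

```python
SMALL_MAX = 2 * 1024        # hypothetical upper bound for "small" requests
MEDIUM_MAX = 256 * 1024     # hypothetical upper bound for "medium" requests
SMALL_STEP = 16             # hypothetical rounding step for small blocks

def classify(size):
    """Return (class_name, size_actually_reserved) for a request of `size` bytes."""
    if size <= SMALL_MAX:
        # Small requests are rounded up so that equal-sized blocks share a pool
        # and a freed slot can always be reused by the next request of that class.
        rounded = -(-size // SMALL_STEP) * SMALL_STEP
        return ("small", rounded)
    if size <= MEDIUM_MAX:
        return ("medium", size)
    # Large requests would go straight to the OS in this sketch.
    return ("large", size)

for request in (1, 17, 4096, 1 << 20):
    print(request, "->", classify(request))
```

Keeping same-sized small blocks together is what avoids the free/allocated/free/allocated patchwork of a single linear heap, and it also tends to keep objects of one kind close together in memory.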
virtual memory that wouldn't be a problem at all (and in 64bit there are virtual memory solutions to reallocating large blocks to arbitrary sizes without a copy).
You can't map memory on your own in Windows. You can only reserve memory and then later commit it for usage.
You can't decide exactly where things will be mapped, but you have a fair amount of control. The strategy I was referring to here is that of memory mapped files, which can be done on purely in-memory files (i.e. no HDD access). This allows you to map parts of large blocks, or remap a large block to another location (changing its size in the process).
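The idea of remapping instead of copying can be sketched with a toy page table. This is a model only, not the Windows file-mapping API: `AddressSpace`, `map_new` and `realloc_by_remap` are hypothetical names, and the page numbers are arbitrary. It shows that growing a large block at a new virtual location only changes which virtual pages point at the existing physical frames.

```python
class AddressSpace:
    """Toy model of a page table: virtual page number -> physical frame number."""
    def __init__(self):
        self.page_table = {}
        self.next_frame = 0

    def map_new(self, vpage, count):
        """Back `count` virtual pages starting at `vpage` with fresh frames."""
        for i in range(count):
            assert vpage + i not in self.page_table, "range already mapped"
            self.page_table[vpage + i] = self.next_frame
            self.next_frame += 1

    def realloc_by_remap(self, old_vpage, old_count, new_vpage, new_count):
        """Move a block to a new virtual range and grow it without copying data."""
        assert new_count >= old_count, "this sketch only grows blocks"
        # Detach the existing frames from the old virtual range...
        frames = [self.page_table.pop(old_vpage + i) for i in range(old_count)]
        # ...and attach the very same frames at the new location.
        for i, frame in enumerate(frames):
            self.page_table[new_vpage + i] = frame
        # Only the extra tail needs new physical memory.
        self.map_new(new_vpage + old_count, new_count - old_count)

space = AddressSpace()
space.map_new(vpage=100, count=256)    # a 1 MB block of 4 KB pages
space.map_new(vpage=356, count=1)      # a neighbour that blocks in-place growth
frames_before = [space.page_table[100 + i] for i in range(256)]

space.realloc_by_remap(100, 256, new_vpage=10_000, new_count=512)
frames_after = [space.page_table[10_000 + i] for i in range(256)]

print("same physical frames after the move:", frames_before == frames_after)
```

The cost of such a realloc is proportional to the number of page table entries touched rather than to the number of bytes in the block, which is why plenty of spare virtual address space makes it attractive.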
The point is even in 64 bit you can't simply reserve the whole 64 bit address range, since mapping tables are limited too.
The point is, you wouldn't need to. Virtual address space, by the simple virtue of being nearly unlimited (relative to physical memory), already opens up a lot of allocation strategies.
It depends on the allocation scheme. If the heap is fragmented, it doesn't mean that the application stalls or runs out of memory. It simply uses somewhat more memory.
There again, that's a purely theoretical statement.
I would like to see one example in practice where a GC-based application (long-running or not) uses less memory than FastMM. So far all examples that have been mentioned had the GC using significantly more memory (both physical and virtual).
GC-based alternatives, where the GC stability is so poor that MS doesn't
Any links ?
A basic google search would have given you plenty, here is a random link from google's top picks
http://blogs.msdn.com/maoni/archive/2004/09/25/234273.aspx
Server GC is very explicitly a blocking GC, it's not very hard to understand why.
And I can easily blow off a native heap by fragmenting it.
Ok. Try your best with FastMM. Let's see how much memory you need to allocate before having trouble, and then we'll see how much memory I need to allocate to get the .Net GC in trouble :)
If I allocate memory:
You're still thinking ancient heap allocators, like ancient RTL MM, or as if you were using a linear GC allocator.
But anyway, that's all hot air, I suggest you go ahead and try to wreak havoc on FastMM's allocation strategy. Let's see how much code and memory it takes to do that. Then, we'll see ;)
Eric