Re: Blind Fastcode Memory Manager Challenge

From: Pierre le Riche (pleriche_at_hotmail.com)
Date: 11/09/04

    Date: Tue, 9 Nov 2004 08:10:18 +0200
    
    

    Hi Dan,

    > could pretty much always beat the default MM. I was excited when I saw the
    > results of the NexusMM v2 and how well it scaled compared to the default
    > MM.
    > I'm curious though, did you design your MM to handle smp systems, or was
    > that a nice side effect?

    In the past I used HPMM, because the RTL MM used to fragment quite heavily
    and it's really not suitable for multi-threaded applications (due to its
    rather severe locking mechanism). Unfortunately HPMM has a lot of memory
    overhead, which is not so great. I recently looked at a few of the other MMs
    that are available for both Delphi and some other languages and decided to
    try and combine all the best ideas that I had come across.

    My initial aim was to write a MM that is fast with single-threaded
    applications, because that is what I write 90% of the time. However, I also
    wanted it to be as fast as I could make it with multi-threaded apps without
    affecting the single-threaded speed. I tried avoiding locks completely (like
    NexusDB v2 does), but that causes so much overhead that single-threaded
    performance suffers. I also tried using different "arenas" like some of the
    Linux ones do (in effect creating another memory manager with its own memory
    pool if all other managers are currently busy), but that raised the memory
    overhead.
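
The "arena" idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from any of the memory managers named in the post: the names `Arena`, `ArenaAllocator`, `acquire_arena` and `release_arena` are hypothetical, and the pool itself is reduced to a byte counter. A thread tries each existing arena without blocking and creates a new one only when all of them are busy, which is where the extra memory overhead comes from.

```python
import threading


class Arena:
    """One independent pool with its own lock (hypothetical, for illustration)."""

    def __init__(self, arena_id):
        self.arena_id = arena_id
        self.lock = threading.Lock()
        self.allocated = 0  # bytes handed out; stands in for a real memory pool


class ArenaAllocator:
    """Hands out a locked arena, growing the arena list when all are busy."""

    def __init__(self):
        self._arenas = [Arena(0)]
        self._grow_lock = threading.Lock()  # protects the arena list itself

    def acquire_arena(self):
        # Try every existing arena without blocking.
        for arena in list(self._arenas):
            if arena.lock.acquire(blocking=False):
                return arena
        # All arenas are busy: create another one, already locked for the caller.
        with self._grow_lock:
            arena = Arena(len(self._arenas))
            arena.lock.acquire()
            self._arenas.append(arena)
        return arena

    def release_arena(self, arena):
        arena.lock.release()

    @property
    def arena_count(self):
        return len(self._arenas)
```

In this sketch a thread never waits for a busy arena, but each new arena would carry its own pool, so memory use grows with the peak number of simultaneous allocations.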

    Eventually I figured that if I could just make the time spent servicing
    GetMem and FreeMem calls as short as possible, it wouldn't matter that much
    on SMP systems that I do use locking - after all, there's something
    seriously wrong with the design of an application that spends all its time
    allocating and deallocating memory. I eventually stuck with the locking
    mechanism that is also used in RecyclerMM. Each small block size is locked
    individually and large blocks aren't locked at all. With enough small block
    sizes, the probability of a collision between threads can be reduced
    significantly. I tuned the number of block types and sizes until I found a
    combination that worked well.
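
A minimal sketch of per-size locking, as described in the paragraph above, might look like this. It is an assumption-laden illustration, not the actual code of this MM or of RecyclerMM: the size table `SMALL_BLOCK_SIZES`, the classes `SizeClass` and `SmallBlockAllocator`, and the use of integer block ids instead of real addresses are all hypothetical. Large blocks are out of scope here and are simply rejected.

```python
import bisect
import threading

# Hypothetical small block sizes in bytes; the post does not list the real ones.
SMALL_BLOCK_SIZES = [8, 16, 32, 64, 128, 256, 512, 1024]


class SizeClass:
    """State for one small block size, guarded by its own lock."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.lock = threading.Lock()
        self.free_list = []  # ids of freed blocks available for reuse
        self.next_id = 0     # next never-used block id


class SmallBlockAllocator:
    def __init__(self, sizes=SMALL_BLOCK_SIZES):
        self.sizes = list(sizes)
        self.classes = [SizeClass(s) for s in self.sizes]

    def class_for(self, size):
        # Smallest block size that can hold the request.
        index = bisect.bisect_left(self.sizes, size)
        if index == len(self.sizes):
            raise ValueError("not a small block")
        return self.classes[index]

    def get_mem(self, size):
        cls = self.class_for(size)
        with cls.lock:  # only this size class is locked
            if cls.free_list:
                block_id = cls.free_list.pop()
            else:
                block_id = cls.next_id
                cls.next_id += 1
        return (cls.block_size, block_id)

    def free_mem(self, block):
        block_size, block_id = block
        cls = self.class_for(block_size)
        with cls.lock:
            cls.free_list.append(block_id)
```

Two threads only contend when they ask for the same size class at the same moment. Under the simplifying assumption that each thread picks among n classes uniformly and independently, two simultaneous requests hit the same lock with probability 1/n, which is why adding more block sizes reduces collisions. Real allocation patterns are far from uniform, so this is only a rough guide.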

    NexusMM is probably the way to go if you are planning on buying a quad CPU
    system, but in single (and sometimes dual) CPU systems the extra overhead of
    avoiding locking completely just doesn't seem to pay off.

    Regards,
    Pierre

