96-bit lockless counter
From: Terje Mathisen (terje.mathisen_at_hda.hydro.com)
Date: 03/20/04
- Previous message: Terje Mathisen: "Re: Lockless 63-bit Counter"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sat, 20 Mar 2004 09:32:05 +0000 (UTC)
I mentioned a day or two ago that I had figured out a way to support
longer counters, with better performance than the current 63-bit code,
well here it is:
The crucial idea is that _only_ the bottom 32 bits of the counter is
shared, all other processes maintain their own copy of the higher parts!
This is easy to do if you can guarantee that all threads will sample
(not neccessarily upate!) the counter at least once per full 2^32 round:
Since the basic update (LOCK XADD [counter],EAX) must use a full
read-modify-update bus cycle, it cannot take less time than the basic
bus speed, right?
Assuming this can increase a _lot_, getting down to current cache
speeds, we're still talking about at least 10 ns per update for the next
decade or two.
This means that the counter cannot roll over more often than once every
42 seconds, more probably it will be in the several minutes range.
If you cannot guarantee that all threads will wake up often enough to do
this, then it can still be solved if you setup an extra helper thread
that wakes up each second, and goes around to all of the separate
counters, updating the high counts if needed.
Here's the sampling code:
sampleAndIncrement: ; EBX -> local copy of the full-sized counter
mov eax,1
lock xadd [counter],eax
cmp eax,[ebx]
mov ecx,[ebx+4]
mov edx,[ebx+8]
; If we have wrapped around, the high parts must be updated.
; This can be done branchlessly using the carry from the CMP:
adc ecx,0
adc edx,0
mov [ebx],eax
mov [ebx+4],ecx
mov [ebx+8],edx
ret
However, it is almost certainly faster to skip the updates that aren't
needed:
mov [ebx],eax
jnc done
adc ecx,0
adc edx,0
mov [ebx+4],ecx
mov [ebx+8],edx
done:
ret ; EBX points at counter,
;which is also in EDX:ECX:EAX
Terje
-- - <Terje.Mathisen@hda.hydro.com> "almost all programming can be viewed as an exercise in caching"
- Previous message: Terje Mathisen: "Re: Lockless 63-bit Counter"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|