Avoiding deadlocks in concurrent programming



This is not really Python specific, but I know Python programmers are
among the best in the world. I have a fair understanding of the
concepts involved, enough to realize that I would benefit from the
experience of others :)

I have a shared series of objects in memory that may be > 100MB. Often
to perform a task for a client several of these objects must be used.
Since many clients can be making requests at once (>100 per second
during peak load) these objects must be protected with some method of
thread syncronization (threading.Lock would do fine.) This is somewhat
complicated by a backup thread which takes the data every 10 minutes
and exports it all to a format that can be saved to disk. Clearly the
backup thread must have exclusive access to all of these objects at
once, and there must be no half-completed transactions underway.

The easiest way from a design stand point is have a single lock and let
clients have exclusive access to everything even although they only
ever need access to a fraction of the data. This makes it easy for the
backup thread, and it ensures that there's no trouble with deadlocks or
with partially completed transactions. However imagine what would
happen if there's 100 clients in 100 threads waiting for access to that
lock. One could be almost finished with it, and then 100 threads could
get switched in and out, all doing nothing since they're all waiting
for the lock held by the one thread.

So I think you would need multiple locks so clients only acquire what
they need. This would let multiple threads access the data at once. But
now I have to deal with deadlocks since clients will usually acquire a
resource and then block acquiring another. It is very likely that one
client locks A, another locks B, then the guy with B waits for A and
the guy with A waits for B. Worse yet the backup thread will go around
trying to lock everything and will no doubt deadlock everybody.

How do you avoid this? I thought instead of blocking forever you could
try non-blocking acquires in a loop with a timeout of a few seconds, if
the timeout is reached, release any locks held and try again later,
with the backup thread being the only one to use blocking acquires
since it would never complete it's task otherwise.

No doubt this is a common problem, how would you people deal with it?

-Dan

.



Relevant Pages

  • Re: Lock a row
    ... and if the connection dies you are not even going ... Use the readpast hint as mentioned by Hugo to keep any clients ... >> row from a wrok-queue table and acquire an exclusive lock; ...
    (microsoft.public.sqlserver.programming)
  • Re: Client/Server Fujitsu Cobol appplications
    ... Running the clients with shared file access on the server ... Fujitsu Cobol tech. suport says ... LOCK MODE MANUAL WITH LOCK ON MULTIPLE RECORDS ...
    (comp.lang.cobol)
  • Re: out-of-process servers -- how do I force thread separation?
    ... lockbegins a communication session & ensures exclusive access until ... a corresponding unlock() call. ... lock() will return FALSE. ... Clients should be allowed to ...
    (microsoft.public.vc.atl)
  • Re: 2.6.14-rt4: via DRM errors
    ... >> I made a fix to the locking code in main drm a couple of months ago. ... > DRM lock or several, what kind of lock it is or what it's protecting ... > DRM clients to keep from stepping on each other? ...
    (Linux-Kernel)
  • Re: Avoiding deadlocks in concurrent programming
    ... The backup thread only holds the lock long enough to create an ... in-memory representation of the data. ... It writes to disk on it's own ... time after it has released the lock, so this is not an issue. ...
    (comp.lang.python)