Re: The reliability of python threads
- From: Steve Holden <steve@xxxxxxxxxxxxx>
- Date: Tue, 30 Jan 2007 05:14:33 +0000
Carl J. Van Arsdall wrote:
> Aahz wrote:
>> [snip]
>>
>> My response is that you're asking the wrong questions here. Our database
>> server locked up hard Sunday morning, and we still have no idea why (the
>> machine itself, not just the database app). I think it's more important
>> to focus on whether you have done all that is reasonable to make your
>> application reliable -- and then put your efforts into making your app
>> recoverable.
>
> Well, I assume that I have done all I can to make it reliable. This list
> is usually my last resort, or a place where I come hoping to find ideas
> that aren't coming to me naturally. The only other thing I thought of was
> that there might be network errors. But I've gone back and forth on that,
> because TCP should handle that for me and I shouldn't have to deal with it
> directly in Pyro, although I've added (and continue to add) checks in
> places that appear appropriate (and in some cases, checks because I prefer
> to be paranoid about errors).
>
>> I'm particularly making this comment in the context of your later point
>> about the bug showing up only every three or four months.
>>
>> Side note: without knowing what error messages you're getting, there's
>> not much anybody can say about your programs or the reliability of
>> threads for your application.
>
> Right, I wasn't coming here to get someone to debug my app, I'm just
> looking for ideas. I am constantly trying to find new ways to improve my
> software and new ways to reduce bugs, and when I get really stuck, new
> ways to track bugs down. The exception won't mean much, but I can say that
> the error appears to me as bad data. I do checks prior to performing
> actions on any data; if the data doesn't look like what it should look
> like, then the system flags an exception.
>
> The problem I'm having is determining how the data went bad. In tracking
> down the problem, a couple of guys mentioned that problems like that are
> usually a race condition. From there I examined my code, checked out all
> the locking stuff, made sure it was good, and wasn't able to find
> anything. Since there's one lock and the critical sections are well
> defined, I'm having difficulty. One idea I have for getting a better
> understanding is to check data before it's stored. Again, I still don't
> know how it would get messed up, nor can I reproduce the error on my own.
>
> Do any of you think that would be a good practice for trying to track this
> down? (Check the data after reading it, check the data before saving it)

Are you using memory with built-in error detection and correction?
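
On the check-after-reading, check-before-saving idea raised above, here is a minimal sketch of how it might look. Everything in it is an assumption made for illustration and nothing is taken from the application under discussion: the `CheckedStore` class, the `CorruptRecord` exception, the caller-supplied `validate` predicate, and the choice of a CRC32 checksum are all hypothetical. The point is only that a failure before the save implicates the code producing the data, while a checksum mismatch on read means the bytes changed after they were stored.

```python
import threading
import zlib


class CorruptRecord(Exception):
    """Raised when a record fails validation or its stored checksum."""


class CheckedStore:
    """Hypothetical shared store guarded by a single lock."""

    def __init__(self, validate):
        self._lock = threading.Lock()
        self._validate = validate  # caller-supplied predicate (assumption)
        self._records = {}         # key -> (payload bytes, crc32, writer name)

    def save(self, key, payload):
        writer = threading.current_thread().name
        # Check before saving: a failure here points at the producer.
        if not self._validate(payload):
            raise CorruptRecord(
                "bad data before save: key=%r thread=%s" % (key, writer))
        with self._lock:
            self._records[key] = (payload, zlib.crc32(payload), writer)

    def load(self, key):
        with self._lock:
            payload, crc, writer = self._records[key]
        # Check after reading: a mismatch means the stored bytes no longer
        # match what the writer saved.
        if zlib.crc32(payload) != crc:
            raise CorruptRecord(
                "checksum mismatch on read: key=%r written by %s"
                % (key, writer))
        return payload
```

Recording the writing thread's name alongside each record costs little and, when a failure does finally show up months later, at least narrows down which code path stored the data.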
regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
See you at PyCon? http://us.pycon.org/TX2007