Re: The reliability of python threads



In article <mailman.3166.1169752349.32031.python-list@xxxxxxxxxx>,
Carl J. Van Arsdall <cvanarsdall@xxxxxxxxxx> wrote:
Aahz wrote:

My response is that you're asking the wrong questions here. Our database
server locked up hard Sunday morning, and we still have no idea why (the
machine itself, not just the database app). I think it's more important
to focus on whether you have done all that is reasonable to make your
application reliable -- and then put your efforts into making your app
recoverable.

Well, I assume that I have done all I can to make it reliable. This
list is usually my last resort, or a place where I come hoping to find
ideas that aren't coming to me naturally. The only other thing I
thought to come up with was that there might be network errors. But
I've gone back and forth on that, because TCP should handle that for me
and I shouldn't have to deal with it directly in Pyro, although I've
added (and continue to add) checks in places that appear appropriate
(and in some cases, checks because I prefer to be paranoid about errors).

My point is that an app that dies only once every few months under load
is actually pretty damn stable! That is not the kind of problem that
you are likely to stimulate.

I'm particularly making this comment in the context of your later point
about the bug showing up only every three or four months.

Side note: without knowing what error messages you're getting, there's
not much anybody can say about your programs or the reliability of
threads for your application.

Right, I wasn't coming here to get someone to debug my app, I'm just
looking for ideas. I am constantly trying to find new ways to improve
my software and new ways to reduce bugs, and when I get really stuck,
new ways to track bugs down. The exception won't mean much, but I can
say that the error appears to me as bad data. I do checks prior to
performing actions on any data; if the data doesn't look like what it
should look like, then the system flags an exception.

The problem I'm having is determining how the data went bad. In
tracking down the problem a couple of guys mentioned that problems like
that usually are a race condition. From here I examined my code,
checked out all the locking stuff, made sure it was good, and wasn't
able to find anything. Given that there's one lock and the critical
sections are well defined, I'm having difficulty. One idea I have to
try and get a better understanding might be to check data before it's
stored. Again, I still don't know how it would get messed up nor can I
reproduce the error on my own.

Do any of you think that would be a good practice for trying to track
this down? (Check the data after reading it, check the data before
saving it)

What we do at my company is maintain log files. When we think we have
identified a potential choke point for problems, we add a log call.
Tracking this down will involve logging the changes to your data until
you can figure out where it goes wrong -- once you know where it goes
wrong, you have an excellent chance of figuring out why.
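
A minimal sketch of that kind of change logging, using the standard
logging module, might look like the following. The logger name
"datatrace", the update function, the shared dict, and the log file name
are assumptions made up for illustration; they are not taken from the
application under discussion.

```python
import logging
import threading

log = logging.getLogger("datatrace")  # hypothetical logger name

_lock = threading.Lock()
_shared = {}  # stand-in for the application's shared data


def configure_logging(path="data_changes.log"):
    """Send change records to a file, tagged with time and thread name."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(threadName)s %(levelname)s %(message)s")
    )
    log.addHandler(handler)
    log.setLevel(logging.INFO)


def update(key, new_value):
    """Change a shared value under the lock and log the old and new values."""
    with _lock:
        old_value = _shared.get(key)
        _shared[key] = new_value
        # Logging inside the critical section keeps log order equal to update order.
        log.info("update key=%r old=%r new=%r", key, old_value, new_value)
    return old_value
```

Including the thread name in each line makes it possible to see, after
the fact, which thread last wrote a value before it was found to be bad.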
--
Aahz (aahz@xxxxxxxxxxxxxxx) <*> http://www.pythoncraft.com/

"I disrespectfully agree." --SJM