Re: How to make sure my in-memory db does not flush corrupt page to disk?



On Jun 30, 10:07 pm, Sune <sune_ahlg...@xxxxxxxxxxx> wrote:

<rant>
I was reading this thread *precisely* when an
MS-Windows bug involving concurrency forced
me to reboot and waste time. The MS bug served
as a reminder that proper handling of shared
resources is the software hallmark which distinguishes
the excellent from the contemptible.

In any good Unix system, one can delete files at any
time, etc. One of my tasks in a previous life was to do
absurd things to Unix, *trying* to make it crash, usually
to no avail. I don't even know what the "kernel panic"
message looks like on Linux -- I've never seen it.
Very rarely, when an application is miswritten, I resort
to a "kill -9" or some such, and then the system
continues along fine.

I'm not a "power user" of MS-Windows at all; I use it
to browse the Internet and do associated tasks.
Yet I experience frequent problems: I often need
to try Ctrl-Alt-Del, and often find -- as a moment ago --
that Ctrl-Alt-Del is useless and I need to poke the red button.

What makes this all flabbergasting is that Linux
was hacked out quickly by a teenager in Finland.
MS-Windows is produced by a company whose revenue is
bigger than the GNP of many countries!
</rant>

... How do I make sure stray pointers in the
application using my db (which will load into application process
memory) does not damage my page(s) from the time of latest checkpoint
and next checkpoint? And if it does, I don't want to flush it to disk,
how do I detect a page corruption to avoid this?

I find this *very* confusing, and the further explications
downthread just made it more confusing. One subthread
dealt with memory ECC(!), in another you distinguish
between pointers that point to shared memory, vs.
"where the application intends" ! One worries that you're
trying to come up with something complicated, and
then make it even more complicated when it doesn't work.

If I had to give you three words of advice, they would be
"Simplicity; simplicity; simplicity".

Unix doesn't achieve its remarkable resilience
through complex race-detecting algorithms, but
through simplicity. When told to remove a link
Unix ... removes it! There's a simple reference counter
to avoid trouble.
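
The unlink behaviour is easy to see for yourself. Here is a minimal
sketch, assuming a POSIX system (on MS-Windows the unlink of an open
file will normally fail). The function name read_after_unlink is made
up for this illustration.

```python
import os
import tempfile

def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink the file while it is
    still open, then read the data back through the descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)               # the name is gone immediately
        os.lseek(fd, 0, os.SEEK_SET)  # rewind the still-open descriptor
        return os.read(fd, len(payload))
    finally:
        os.close(fd)                  # last reference dropped; storage is freed

if __name__ == "__main__":
    print(read_after_unlink(b"page data"))
```

The kernel removes the directory entry at once and frees the storage
only when the last open descriptor is closed. No locking and no race
detection are involved, just a count of references.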

I know this advice isn't very specific. One specific
concept which may be useful -- Unix often employs it
in some of its submethods -- is to find a way to tolerate
certain kinds of errors.

1)
Am I simply being paranoid?

Paranoia is definitely a good idea when programming
for concurrent resource sharing. Think in specifics,
think, analyze, rethink.

2)
Should I, for every update, calculate the checksum of the changed
record? Records can be large and this sounds like a pain in the back,
performance-wise.

Calculating a checksum for every disk write *might*
be tolerable performance-wise. But I'm completely
unclear about what problem this addresses.
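
One possible reading of the question: every legitimate update goes
through the db's own write routine, which refreshes a per-page
checksum, and the flush routine refuses any page whose bytes no
longer match. A rough sketch under that assumption follows. Page,
PAGE_SIZE and flush are hypothetical names, and CRC-32 from zlib
stands in for whatever checksum would really be used.

```python
import io
import zlib

PAGE_SIZE = 4096  # hypothetical page size

class Page:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.crc = zlib.crc32(self.data)

    def update(self, offset, payload):
        """The only sanctioned write path: change bytes, refresh checksum."""
        if offset < 0 or offset + len(payload) > PAGE_SIZE:
            raise ValueError("write outside page")
        self.data[offset:offset + len(payload)] = payload
        self.crc = zlib.crc32(self.data)

    def is_intact(self):
        """True if the bytes still match the last sanctioned update."""
        return zlib.crc32(self.data) == self.crc

def flush(page, out):
    """Write the page to `out` only if its checksum still matches."""
    if not page.is_intact():
        return False
    out.write(page.data)
    return True

if __name__ == "__main__":
    p = Page()
    p.update(10, b"hello")
    print(flush(p, io.BytesIO()))   # sanctioned update: page is flushed
    p.data[100] ^= 0xFF             # simulated stray write, bypasses update()
    print(flush(p, io.BytesIO()))   # checksum mismatch: page is refused
```

This recomputes the checksum over the whole page on every update,
which is exactly the cost the question worries about. It also does
nothing about a stray write that lands on the stored checksum itself
or that happens inside update().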

The checksum does not have to be idiot proof, that is, the checksum
does not have to be unique for the bit pattern of the page, unless
there is a cheap way of achieving this of course, but I don't think
there is...

Checksums are like hash functions; a good 16-bit
checksum will catch all but a fraction 2^-16 of random errors.
But unless you're *very* *very* desperate you want
provably correct, not probabilistic.
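
To put numbers on "probabilistic": assuming the checksum behaves like
a uniformly random k-bit function of the page contents (an
idealization), the chance that a corrupted page slips through, and
the chance that at least one of n corrupted pages does, are roughly:

```latex
\[
  P_{\text{miss}} = 2^{-k}, \qquad
  2^{-16} \approx 1.5 \times 10^{-5}, \qquad
  2^{-32} \approx 2.3 \times 10^{-10}
\]
\[
  P(\text{at least one miss in } n \text{ corrupted pages})
  = 1 - \left(1 - 2^{-k}\right)^{n} \approx n \cdot 2^{-k}
  \quad \text{for } n \ll 2^{k}
\]
```

Small, but never zero, no matter how many bits are spent.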

James Dow Allen
