Re: File Saving Strategies
From: H. S. Lahman (h.lahman_at_verizon.net)
Date: 09/30/04
Date: Wed, 29 Sep 2004 22:14:45 GMT
Responding to MacDonald...
> First off, let me say that if there is a more appropriate newsgroup to
> post this message, please let me know and I apologize. I am posting
> here because it seems sort of like the right place and I respect what
> I've read here quite a bit.
>
> I'm trying to figure out the most robust strategy possible for saving
> users' files for a desktop application. The goals are (1) that the
> user's data be saved frequently, and (2) if it is corrupted, that
> recent data be recoverable. Hard disk space is not really an issue.
>
> What I've come up with is the following:
>
> (1) The user has a standard file that he can "save" or "save as" at
> any time, like a standard word processing app.
>
> (2) The file is "auto-saved" at a configurable interval (once every
> five minutes --> once an hour) to a different file.
Rather than absolute time, you might consider (minimum time + next
convenient save point). The idea is that incremental saves whose
logical boundaries match the problem, user habits, or storage schemas
may be easier to deal with when trying to recover. In particular, be
wary of data consistency. Your backup may not be useful if it takes a
snapshot of the data before the user has made it fully consistent within
an editing cycle.
You might also want to trigger a save depending on user activity. You
don't want to save files that the user has only been reading, especially
if the application goes into hourglass mode when saving. Similarly, if
the user has been making copious changes, you might want to trigger a
save earlier than the schedule.
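
A minimal sketch of such a trigger, in Python. The class name, the
thresholds, and the idea that the editor reports an "at_save_point"
flag are all assumptions made for illustration; none of them come from
the original design.

```python
from dataclasses import dataclass


@dataclass
class AutosavePolicy:
    """Decides when to auto-save. All names and defaults are hypothetical."""

    min_interval: float = 300.0  # seconds that normally pass between auto-saves
    burst_threshold: int = 200   # pending edits that make a save due early
    last_save: float = 0.0       # timestamp of the last auto-save
    pending_edits: int = 0       # edits made since the last auto-save

    def record_edit(self) -> None:
        self.pending_edits += 1

    def should_save(self, now: float, at_save_point: bool) -> bool:
        # Nothing changed: the user has only been reading.
        if self.pending_edits == 0:
            return False
        # A save is due once the minimum time has passed, or earlier
        # if the user has made a lot of changes.
        due = (now - self.last_save >= self.min_interval
               or self.pending_edits >= self.burst_threshold)
        # Even when due, wait for a point where the data is consistent.
        return due and at_save_point

    def mark_saved(self, now: float) -> None:
        self.last_save = now
        self.pending_edits = 0
```

The application would call should_save() from its idle loop or timer
and pass at_save_point=True only when no edit is half finished, for
example after a record has been committed.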
>
> (3) The "auto-save" file rotates on a daily basis for five days. I.e.,
> on day one, it saves to file1.dat; on day two, it saves to file2.dat,
> and so on until day six, when it goes back again to file1.dat.
>
> (4) When the program opens, it attempts to open the user's actual
> file. If it cannot, it proceeds through the auto-saved files by date,
> most recent to least recent, until it finds one it can open.
Hopefully it tells the user it is doing this. B-)
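
As a rough sketch of step (4) with the notification added: the function
below tries the main file, then the auto-saves from newest to oldest,
and reports each decision through a callback. The names
open_with_fallback, load, and notify are hypothetical, and it assumes
the loader raises OSError or ValueError on an unreadable file.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional, Tuple


def open_with_fallback(
    primary: Path,
    autosaves: Iterable[Path],
    load: Callable[[Path], object],
    notify: Callable[[str], None],
) -> Tuple[Optional[object], Optional[Path]]:
    """Return (data, source_path), or (None, None) if nothing is readable."""
    try:
        return load(primary), primary
    except (OSError, ValueError) as err:
        notify(f"Could not open {primary.name}: {err}")

    # Order the existing auto-save files by modification time, newest first.
    candidates = sorted(
        (p for p in autosaves if p.exists()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for candidate in candidates:
        try:
            data = load(candidate)
        except (OSError, ValueError):
            continue  # this auto-save is unreadable too; try an older one
        notify(f"Recovered from auto-save {candidate.name}")
        return data, candidate

    notify("No readable auto-save was found")
    return None, None
```

In a real application notify would probably be a dialog that also tells
the user how old the recovered copy is.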
>
> (5) In addition, when the user quits, she is prompted to save to a
> backup, where both the binary format of the file and a plain text
> version are saved.
>
> Does this sound good? What can be improved about it?
That depends on what you mean by "most robust strategy possible". B-)
Taken literally, you have no alternative but to have some sort of
continuous save (logging) facility. Otherwise you lose whatever was
entered between the last viable save and the moment the system goes
down, or the moment you discover that the most recently stored data was
corrupted. You will probably also need some sort of transactional
logging of user activity independent of the file saves.
For example, suppose the user is diligently recording last week's
bookings figures for a couple of hundred salespeople. The file backup
is done in the middle of that. The next backup is done after all the
data is in, but it is later discovered that the second backup was
corrupted. How will the user know where the first backup left off in
the data entry? For a reasonably robust system you will probably need a
transactional log of some sort that the user can consult. (In fact,
that is one reason why IT is still dominated by transaction processing
even though the days of punched cards and batch processing are long
gone.)
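
One way such a log might look is an append-only file with one JSON
record per user action, flushed to disk on every write. This is only a
sketch: the EditJournal class, the "seq" field, and the JSON-lines
format are assumptions made for illustration, not anything the original
design specifies.

```python
import json
import os
from pathlib import Path


class EditJournal:
    """Append-only log of user edits, one JSON record per line."""

    def __init__(self, path):
        self.path = Path(path)

    def append(self, entry: dict) -> None:
        line = json.dumps(entry, sort_keys=True)
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the record to disk

    def entries(self) -> list:
        if not self.path.exists():
            return []
        result = []
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                try:
                    result.append(json.loads(line))
                except json.JSONDecodeError:
                    break  # unreadable record, e.g. cut short by a crash
        return result


def entries_after(journal: EditJournal, last_seq: int) -> list:
    """Records the user entered after the one numbered last_seq."""
    return [e for e in journal.entries() if e["seq"] > last_seq]
```

If each file backup stores the sequence number of the last record it
contains, then entries_after() answers the question in the bookings
example: it lists exactly what was entered after the first backup, so
it can be replayed or at least shown to the user.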
Note that corrupted data can be nasty if there is no scheduled check
for corruption. I know of a Fortune 500 company that lost an entire
quarter's financial data because they always used the same tape drive
(ca. '75) for backups and its head had been dropping a bit for six
months. They discovered the problem only when they lost their online
disk. They even had an offsite backup, but that was just a tape copy of
the original backup.
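
A scheduled check could be as simple as storing a checksum beside each
backup and recomputing it periodically. The sketch below uses SHA-256
from Python's hashlib; the ".sha256" sidecar file and the function
names are assumptions chosen for illustration. A checksum only detects
changes made after it was written, so it does not replace an occasional
full test restore.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large backups need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(path: Path) -> Path:
    """Record the backup's checksum in a sidecar file next to it."""
    sidecar = path.with_name(path.name + ".sha256")
    sidecar.write_text(sha256_of(path), encoding="ascii")
    return sidecar


def verify_backup(path: Path) -> bool:
    """True only if the backup still matches its recorded checksum."""
    sidecar = path.with_name(path.name + ".sha256")
    try:
        expected = sidecar.read_text(encoding="ascii").strip()
        return sha256_of(path) == expected
    except OSError:
        return False  # backup or sidecar is missing or unreadable
```

Running verify_backup() over every auto-save and backup at startup, or
on a timer, would surface a bad copy long before it is needed.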
You should also consider redundant storage to guard against corrupted
data. You might want to think about that even for incremental saves.
If you lose the whole disk, it doesn't matter how many times you saved
to it. So you might want a backup on physically separate media or at a
different network location. The level of paranoia would determine
whether that is an independently scheduled background backup, a literal
parallel save, or something in between.
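
The "literal parallel save" end of that range might be sketched as
below: the same bytes are written to every target location, each write
goes to a temporary file that is then renamed into place so a crash
cannot leave a half-written copy, and failures are collected rather
than aborting the whole save. The function names and the choice of
targets are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, data: bytes) -> None:
    """Write to a temp file in the same directory, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_name, path)  # replaces any existing file in one step
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def save_redundantly(data: bytes, targets) -> dict:
    """Save to every target; return {path: error} for the ones that failed."""
    failures = {}
    for target in targets:
        target = Path(target)
        try:
            atomic_write(target, data)
        except OSError as err:
            failures[target] = err
    return failures
```

A caller would treat the save as good if at least one target succeeded
and warn the user about the rest, for example when a network share is
unavailable.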
*************
There is nothing wrong with me that could
not be cured by a capful of Drano.
H. S. Lahman
hsl@pathfindermda.com
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog (under constr): http://pathfinderpeople.blogs.com/hslahman
(888)-OOA-PATH