Re: EEPROM guarantees after power loss during a write



John Devereux wrote:

update():

a) mark copy 1 invalid
b) write new copy 1
c) mark copy 1 valid

[same again for copy 2]

startup(): any copy marked invalid is replaced by the copy marked valid.

The steps happen in strict order. Each previous step must complete
successfully before the next is started. So the only way the valid
flag can be set is if the data has been successfully written, without
interruption.
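
A minimal C sketch of the update()/startup() sequence above, assuming a
RAM array stands in for the EEPROM. The names (copy_t, eeprom, write_copy),
the flag values and the record size are all made up for illustration; real
code would go through the part's EEPROM driver and wait for each write to
complete before starting the next.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RECORD_SIZE  8u
#define FLAG_VALID   0xA5u   /* arbitrary illustrative values */
#define FLAG_INVALID 0x00u

/* One stored copy: a valid flag followed by the record itself. */
typedef struct {
    uint8_t flag;
    uint8_t data[RECORD_SIZE];
} copy_t;

/* Hypothetical stand-in for the EEPROM contents. */
static copy_t eeprom[2];

/* Steps a), b), c) for a single copy, in strict order. */
static void write_copy(copy_t *c, const uint8_t *src)
{
    c->flag = FLAG_INVALID;              /* a) mark copy invalid */
    memcpy(c->data, src, RECORD_SIZE);   /* b) write new data    */
    c->flag = FLAG_VALID;                /* c) mark copy valid   */
}

/* Update copy 1, then copy 2. */
void update(const uint8_t *src)
{
    assert(src != NULL);
    write_copy(&eeprom[0], src);
    write_copy(&eeprom[1], src);
}

/* Replace any copy marked invalid with the copy marked valid.
 * Returns 0 on success, -1 if neither copy is marked valid. */
int startup(void)
{
    int v0 = (eeprom[0].flag == FLAG_VALID);
    int v1 = (eeprom[1].flag == FLAG_VALID);

    if (!v0 && !v1)
        return -1;
    if (v0 && !v1)
        write_copy(&eeprom[1], eeprom[0].data);
    else if (v1 && !v0)
        write_copy(&eeprom[0], eeprom[1].data);
    return 0;
}
```

Because copy 2 is not touched until copy 1 has been marked valid again, a
single power cut can leave at most one copy in doubt.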

On what basis would you come to know the data is valid, given that you
don't have a checksum?

The data is marked valid only *after* it has been successfully
written. If writing of data is interrupted, then the flag is never set
either. So next time it powers up we know that copy may be bad, and
restore from the good one.

There is always at least one good copy.

Let us look at what happens if programming is interrupted during a, b
or c above.

a) The copy 1 valid *flag* is left in an unknown state. But the actual
data is valid. So either the startup will see it invalid and restore
the data, or it sees it valid and all is OK.

b) The data is marked invalid, and the *data* is left in an unknown
state. This is OK, the startup will see the invalid flag and restore
the data.

c) The data has been correctly written, but the valid flag is left in
an unknown state. If the startup sees the flag as valid, that is OK,
because the data is in fact valid. If it sees it as invalid, the data
will be restored from the other copy. Still OK.

Obviously this makes a few assumptions: the eeprom has not worn out,
and that there is some brownout protection so that the CPU does not go
crazy and erase everything.

Another assumption is that the flags are either programmed or not
programmed. But what if the flag programming gets interrupted so that
the flag state is not only unknown, but is actually *unreliable*? That
is, it is only "half programmed" (or half erased), so it sometimes
reads "valid" and sometimes "invalid". In this condition the state read
could depend on temperature, age or supply noise.

It would require a very unlikely sequence of events, but you could
have:

update()
...
mark copy 2 invalid
write copy 2
mark copy 2 valid <interrupted>

Then on power up, the copy 2 valid flag is unreliable, but at startup
it happens to read as valid.

Then next time we do an update, we get *another* power cut, this time
during copy 1 update. And at power up, this time copy 2 reads
*invalid*. So we have no valid copies.

I think the solution is to reprogram the "valid" flags every startup.
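
A rough sketch of that idea in C, under a deliberately simple model that
is my own assumption and not taken from any datasheet: each flag cell holds
a level from 0 (fully erased) to 255 (fully programmed), and a read compares
it against a threshold that drifts with temperature, age or supply noise.
All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_COPIES 2

/* Assumed model: stored level of each copy's "valid" flag cell.
 * 0 = fully erased, 255 = fully programmed, anything in between
 * is a half-programmed cell. */
static uint8_t flag_level[N_COPIES];

/* A read reports "valid" when the level is at or above the threshold.
 * The threshold is passed in so that drift can be simulated. */
bool flag_reads_valid(int i, uint8_t threshold)
{
    assert(i >= 0 && i < N_COPIES);
    return flag_level[i] >= threshold;
}

/* Program the flag cell fully, whatever level it had before. */
void program_flag(int i)
{
    assert(i >= 0 && i < N_COPIES);
    flag_level[i] = 255;
}

/* At startup, reprogram every flag that currently reads valid, so a
 * marginal cell becomes a solid one. Copies that read invalid are left
 * for the normal restore step, which rewrites their flag anyway.
 * Returns the number of flags that read valid. */
int refresh_valid_flags(uint8_t threshold)
{
    int n_valid = 0;
    for (int i = 0; i < N_COPIES; i++) {
        if (flag_reads_valid(i, threshold)) {
            program_flag(i);
            n_valid++;
        }
    }
    return n_valid;
}
```

After the refresh, a flag that happened to read valid this time will keep
reading valid even if the threshold moves, which closes the double power
cut hole described above. It does cost one extra flag write per power up.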

I am sorry if these questions look amateur. I am trying to understand
it, and felt your algorithm is simpler than mine, except for the extra
memory needed for the copies.

I find it a difficult area, too. (And it gets harder if you start
thinking about wear-levelling or if you don't want to allocate a whole
page to a record, or if the record does not fit in a single page...)


A better method is to have a version stamp along with your data. You have two blocks, each structured as "version stamp, data". At startup, you verify each block based on having a valid version (and possibly a checksum as well, if you are particularly paranoid). The latest valid version shows which block you use as your data.

For an update, you erase the block containing the older version of the data. Then you save your data to this block, then you write your new version stamp. There is no need to write your data a second time - it gives no advantages, and halves your eeprom/flash life expectancy.
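
A minimal C sketch of the version-stamp scheme just described, again with a RAM array standing in for the eeprom/flash. The names (block_t, blocks, save_record), the 32-bit stamp and the assumption that an erased block reads as all ones are illustrative choices, not part of the original description. The optional checksum is left out, and version wraparound is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DATA_SIZE      8u
#define VERSION_ERASED 0xFFFFFFFFu  /* assumed: erased cells read all ones */

/* Each block is "version stamp, data". */
typedef struct {
    uint32_t version;
    uint8_t  data[DATA_SIZE];
} block_t;

/* Hypothetical stand-in for the two eeprom/flash blocks. */
static block_t blocks[2];

/* Erase both blocks (first-time initialisation). */
void format_blocks(void)
{
    memset(blocks, 0xFF, sizeof blocks);
}

/* A block counts as valid once its version stamp has been written. */
static int block_is_valid(const block_t *b)
{
    return b->version != VERSION_ERASED;
}

/* Index of the valid block with the latest version, or -1 if none. */
int newest_block(void)
{
    int v0 = block_is_valid(&blocks[0]);
    int v1 = block_is_valid(&blocks[1]);

    if (v0 && v1)
        return (blocks[1].version > blocks[0].version) ? 1 : 0;
    if (v0)
        return 0;
    if (v1)
        return 1;
    return -1;
}

/* Erase the older block, write the data, then write the stamp last. */
void save_record(const uint8_t *src)
{
    assert(src != NULL);
    int cur = newest_block();
    int target = (cur == 0) ? 1 : 0;
    uint32_t next = (cur < 0) ? 0u : blocks[cur].version + 1u;

    memset(&blocks[target], 0xFF, sizeof blocks[target]); /* erase       */
    memcpy(blocks[target].data, src, DATA_SIZE);          /* write data  */
    blocks[target].version = next;                        /* stamp last  */
}
```

If power is lost before the stamp is written, the target block still reads as erased, so newest_block() keeps returning the previous block and its data is untouched.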