Re: NAND flash misery



Vladimir Vassilevsky wrote:
"David Brown" <david@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:486611d3$0$14988$8404b019@xxxxxxxxxxxxxxxxxx
Vladimir Vassilevsky wrote:
Guess how many bad blocks are typical for NAND flash of several GB
capacity? As many as 2 percent! There can be whole areas of
hundreds of megabytes of contiguous bad cells, as well as random
scatter.

It is possible to run an extensive read/write test to find most of
the unreliable blocks, but it takes many hours.

I didn't encounter this problem until we started to use high
capacity CF cards. Bad blocks were very rare for cards of 1GB
and below. Since the flash itself is hidden behind the IDE interface and
a compatible file system, and the read/write performance is critical, it
is generally impossible to apply an error correction scheme.

I was under the impression that flash is more reliable than HDD; now I see
that it is not so. Do you know how reliable the IDE flash drives are?

NAND flash always has defects in manufacturing - the devices are
designed to cope with a certain level of faults to make manufacturing
cheaper (the same applies to many other types of chips, and hard disks).
Each sector in NAND has extra space for error correction and detection
(IIRC, 512 byte sectors are actually 528 bytes in size). Bad blocks can
be detected and marked during manufacture and testing, and blocks that
go bad (due to wearing out) are detected in use and the data moved to
different blocks.
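
As a rough illustration of the remapping described above (not how any
particular controller firmware works), here is a minimal Python sketch.
The class name RemapTable, the spare-pool layout at the end of the device,
and the method names are all assumptions made up for this example.

```python
class RemapTable:
    """Toy logical-to-physical block map with a pool of spare blocks.

    Hypothetical layout: the last `spare_blocks` physical blocks are
    reserved as replacements for blocks that go bad.
    """

    def __init__(self, total_blocks, spare_blocks):
        self.usable = total_blocks - spare_blocks
        self.spares = list(range(self.usable, total_blocks))
        self.overrides = {}  # logical block -> replacement physical block

    def physical(self, logical):
        """Return the physical block currently backing a logical block."""
        if not 0 <= logical < self.usable:
            raise IndexError("logical block out of range")
        return self.overrides.get(logical, logical)

    def retire(self, logical):
        """Point a logical block at the next spare; return the new block."""
        self.physical(logical)  # range check only
        if not self.spares:
            raise RuntimeError("no spare blocks left")
        self.overrides[logical] = self.spares.pop(0)
        return self.overrides[logical]


table = RemapTable(total_blocks=100, spare_blocks=4)
table.retire(7)            # block 7 has gone bad
print(table.physical(7))   # now served from a spare block
```

The host keeps addressing logical block 7; only the controller knows
that a different physical block is behind it.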

The utterly bad blocks are detected at manufacturing; however, there are a
number of unreliable blocks which take hours of testing to discover. If
a bad block is detected when in use, it means that the data is lost
already. It is too late to hide it by remapping.


The point of ECC - error checking and *correcting* - is that slightly bad blocks do not lead to lost data. The most common problem in flash blocks (excluding any totally failed blocks found in manufacturing) is single-bit errors - bits that can't erase or program properly. These single-bit errors do not lead to data loss, and the flash controller can easily detect and correct them. It is even possible that the controller will continue using a block with bad bits, and will not disable the block until a certain number of bad bits have been found (I don't know what error rates are used here in practice).
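
To make the single-bit case concrete, here is a toy Python sketch of a Hamming-style check. It is not the algorithm any real flash controller is known to use (real parts typically use row/column Hamming or BCH codes computed in hardware), and the function names are invented for the example. The check value is the XOR of the 1-based positions of all set bits; a single flipped bit changes that XOR by exactly its own position, so the syndrome points straight at it. The sketch assumes the stored check value itself is intact.

```python
def position_xor(data: bytes) -> int:
    """XOR together the 1-based positions of every set bit in data."""
    acc = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if byte & (1 << bit):
                acc ^= byte_index * 8 + bit + 1
    return acc


def correct_single_bit(data: bytes, stored_code: int) -> bytes:
    """Return data with at most one flipped bit repaired.

    Assumption: at most one data bit is wrong and stored_code is good.
    Two or more bad bits will be miscorrected or rejected.
    """
    syndrome = position_xor(data) ^ stored_code
    if syndrome == 0:
        return data  # nothing to fix
    pos = syndrome - 1
    if pos >= len(data) * 8:
        raise ValueError("uncorrectable: syndrome outside the sector")
    fixed = bytearray(data)
    fixed[pos // 8] ^= 1 << (pos % 8)
    return bytes(fixed)


sector = bytes(range(256)) * 2      # 512 bytes of sample data
code = position_xor(sector)         # would live in the spare area

damaged = bytearray(sector)
damaged[100] ^= 0x10                # one bit fails to program
repaired = correct_single_bit(bytes(damaged), code)
print(repaired == sector)
```

For a 512 byte sector the check value needs only 13 bits, which is why it fits comfortably in the extra bytes per sector.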

CF cards and other earlier flash devices are not that great at wear
levelling and bad block handling (that's one of the reasons for
flash-specific file systems like YAFFS and JFFS2).

The actual erase block size in NAND flash is something like 32/64/128/256KB,
being bigger for the higher capacity devices. What it implies: any write
operation through the IDE interface is actually read - copy - erase -
modify - write at the controller level. The other consequence of that is
the speed penalty for writes misaligned to the erase block size. Since
all of this is hidden behind IDE, there is no point in using YAFFS or JFFS2
or such. A disk cache with blocks of 32k makes a lot of sense though.
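
The misalignment penalty can be seen with a little arithmetic. The
Python sketch below only counts which erase blocks a host write overlaps;
the 32KB block size and the function name are assumptions for the
example, and a real controller may buffer or remap writes differently.

```python
ERASE_BLOCK = 32 * 1024  # assumed erase block size in bytes


def erase_blocks_touched(offset, length, block_size=ERASE_BLOCK):
    """Return the range of erase block indices a write overlaps."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


# A 32KB write aligned to the erase block touches one block...
print(len(erase_blocks_touched(0, 32 * 1024)))
# ...while the same write shifted by one 512 byte sector touches two,
# so the controller has to read, erase and rewrite twice as much.
print(len(erase_blocks_touched(512, 32 * 1024)))
```

A cache that collects writes into whole, aligned 32k units keeps each
flush down to a single erase block.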


YAFFS and JFFS2 are designed for when you have direct access to the flash. When you are using a controller that handles the wear levelling and block placement (such as for CF cards and IDE/SAS/SATA controllers), you should not use flash-specific file systems of that sort.
