Re: NAND flash misery
- From: David Brown <david@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Sat, 28 Jun 2008 18:08:48 +0200
Vladimir Vassilevsky wrote:
"David Brown" <david@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:486611d3$0$14988$8404b019@xxxxxxxxxxxxxxxxxx
Vladimir Vassilevsky wrote:
Guess how many bad blocks are typical for NAND flash of several GB
capacity? As many as 2 percent! There can be whole areas of
hundreds of megabytes of contiguous bad cells, as well as random
scatter.
It is possible to run an extensive read/write test to find most of
the unreliable blocks, but it takes many hours.
I didn't encounter this problem until we started to use high
capacity CF cards. Bad blocks were very rare on cards of 1GB
and below. Since the flash itself is hidden behind the IDE interface and
a compatible file system, and the read/write performance is critical, it
is generally impossible to apply an error correction scheme.
I was under the impression that flash is more reliable than HDD; now I see
that it is not so. Do you know how reliable the IDE flash drives are?

NAND flash always has defects in manufacturing - the devices are
designed to cope with a certain level of faults to make manufacturing
cheaper (the same applies to many other types of chips, and hard disks).
Each sector in NAND has extra space for error correction and detection
(IIRC, 512 byte sectors are actually 528 bytes in size). Bad blocks can
be detected and marked during manufacture and testing, and blocks that
go bad (due to wearing out) are detected in use and the data moved to
different blocks.
The utterly bad blocks are detected at manufacturing; however, there is a
bunch of unreliable blocks which take hours of testing to discover. If
a bad block is detected when in use, it means that the data is already
lost. It is too late to hide it by remapping.
The point of ECC - error checking and *correcting* - is that slightly bad blocks do not lead to lost data. The most common problem in flash blocks (excluding any totally failed blocks found in manufacturing) is single-bit errors - bits that can't be erased or programmed properly. These single-bit errors do not lead to data loss, and the flash controller can easily detect and correct them. The controller may even continue using a block with bad bits, and only disable the block once a certain number of bad bits have been found (I don't know what error rates are used here in practice).
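To illustrate how a single-bit error can be corrected rather than just detected, here is a minimal Python sketch of a Hamming-style scheme. It is not what any particular flash controller implements: the function names `compute_ecc` and `correct` are made up for this example, and it assumes the ECC value kept in the spare area is itself read back intact. The ECC is the XOR of the 1-based positions of all set bits plus one overall parity bit, which for a 512 byte sector fits in 14 bits.

```python
def compute_ecc(data):
    """Return (XOR of 1-based positions of all set bits, overall parity)."""
    pos_xor = 0
    parity = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                # Positions are 1-based so a flip of bit 0 gives a non-zero syndrome.
                pos_xor ^= byte_index * 8 + bit + 1
                parity ^= 1
    return pos_xor, parity


def correct(data, stored_ecc):
    """Compare data against the stored ECC; repair a single flipped bit if found."""
    pos_xor, parity = compute_ecc(data)
    syndrome = pos_xor ^ stored_ecc[0]
    parity_diff = parity ^ stored_ecc[1]
    if syndrome == 0 and parity_diff == 0:
        return bytes(data), "ok"
    if parity_diff == 1 and 1 <= syndrome <= len(data) * 8:
        # Exactly one bit flipped: the syndrome is its 1-based position.
        fixed = bytearray(data)
        index = syndrome - 1
        fixed[index // 8] ^= 1 << (index % 8)
        return bytes(fixed), "corrected"
    # Parity unchanged but syndrome non-zero: at least two bits flipped.
    return bytes(data), "uncorrectable"


# Hypothetical 512 byte sector with one bit flipped after the ECC was stored.
sector = bytes(range(256)) * 2
ecc = compute_ecc(sector)
damaged = bytearray(sector)
damaged[100] ^= 0x10
fixed, status = correct(bytes(damaged), ecc)
print(status, fixed == sector)
```

A single flipped bit changes the parity and shifts the position XOR by exactly that bit's position, so it can be located and flipped back. Two flipped bits leave the parity unchanged, so they are detected but not corrected; three or more may be miscorrected, which is why real devices use stronger codes as error rates rise.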
CF cards and other earlier flash devices are not that great at wear
levelling and bad block handling (that's one of the reasons for
flash-specific file systems like YAFFS and JFFS2).
The actual erase block size in NAND flash is something like 32/64/128/256KB,
being bigger for the higher capacity devices. What this implies: any write
operation through the IDE interface is actually read - copy - erase -
modify - write at the controller level. Another consequence is
the speed penalty for writes misaligned to the erase block size. Since
all this machinery is hidden behind IDE, there is no point in using YAFFS
or JFFS or such. A disk cache with blocks of 32k makes a lot of sense though.
YAFFS and JFFS are designed for when you have direct access to the flash. When you are using a controller that handles the wear levelling and block placement (such as for CF cards and IDE/SAS/SATA controllers), you should not use flash-specific file systems of that sort.
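The erase block arithmetic discussed above can be sketched in a few lines of Python. This is a deliberately naive model that assumes the controller rewrites every erase block a write overlaps; real controllers may remap blocks instead of rewriting in place. The 32 KB block size is just the smallest figure quoted in the thread, and the function names are invented for the example.

```python
ERASE_BLOCK = 32 * 1024  # assumed erase block size in bytes


def blocks_touched(offset, length, block_size=ERASE_BLOCK):
    """Number of erase blocks overlapped by a write of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


def bytes_rewritten(offset, length, block_size=ERASE_BLOCK):
    """Bytes the naive controller must erase and reprogram for this write."""
    return blocks_touched(offset, length, block_size) * block_size


# An aligned 32 KB write rewrites one block; the same write shifted by one
# 512 byte sector straddles two blocks and doubles the work.
aligned = bytes_rewritten(0, 32 * 1024)
misaligned = bytes_rewritten(512, 32 * 1024)
single_sector = bytes_rewritten(0, 512)
print(aligned, misaligned, single_sector)
```

Under these assumptions a lone 512 byte sector write costs a full 32 KB erase and reprogram, which is why a cache that collects writes into erase-block-sized, aligned chunks pays off.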