We are using Linux-2.6.28 and 2 Gb NAND Flash in our system ; After some amount of power cycle tests we are observing the following errors :
Volume operational found at volume id 3
read 21966848 bytes from volume 3 to 80400000(buf address)
UBI error: ubi_io_read: error -77 while reading 126976 bytes from PEB 1074:4096, read 126976 bytes
UBI: force data checking
UBI error: ubi_io_read: error -77 while reading 126976 bytes from PEB 1074:4096, read 126976 bytes
UBI warning: ubi_eba_read_leb: CRC error: calculated 0xa7cab743, must be 0x15716fce
read err ffffffb3
These errors are not hardware errors as if we remove the offending partition, we are able to boot the hardware fine; Maybe UBIFS is not correcting the bad UBI block.
Any UBI patches have been added in the latest kernels to address this issue ? Thanks.
The error printed is a UBI error. Lets look at the source near line 177,
So, error '-77' (normally -EBADFD) was returned from the NAND flash driver when trying to read the 'physical erase block' #1074 at offset 4096 (2nd page for 2k pages). UBI include volume management pages which are typically located at the beginning of a physical erase block (PEB for short).
Note that the latest mainline of io.c has the following comment and code,
The following code can be used to process a UBI/UbiFS dump and look for abnormalities,
To build copy ubi-media.h from the UBI directory and run
gcc -Wall -g -o parse_ubi parse_ubi.c
. The code probably has issues on big-endian platforms; it is also not test with 2.6.28 but I believe it should work as the UBI structures shouldn't change. You may have to remove some fastmap code, if it doesn't compile. The code should give some indication on what is wrong with PEB#1074. Make a copy of the partition when failing and use the code above to analyze the UBI layer.It is quite possible that the MTD driver does something abnormal which prevents UBI from attaching to an MTD partition. This in-turn prevents UbiFS from mounting. If you know what MTD Nand flash controller is being used, it would help others determine where the issue is.
It can be caused by MTD bugs and/or hardware bugs or UBI/UbiFS issues. If it is UBI/UbiFs, there are backport trees and newer 3.0. You can try to steal the patches from 2.6.32; after applying all, add the 3.0.
Again, the issue can be the MTD driver. Grab MTD changes for your particular CPU/SOCs NAND flash controller. I do this from the mainline; some changes are bug fixes and others infra-structure. You have to look at each patch individually