2010-09-08 23:54:44 Write different ECC format to device
Reggy Perrin (UNITED STATES)
As a follow-up to my earlier post, we are now in a position where we would like to reflash the NAND u-boot partition from the device (in the field). Unfortunately, we still have the 2 ECC formats in use: u-boot in boot ECC format, kernel/rootfs in regular format.
To attempt to do this upgrade, I was experimenting with these ideas. First, get a binary of the new uboot with ECC info directly from the device:
nanddump -f uboot.bin -n -l 262144 -s 0 /dev/mtd0
In theory, I felt this should dump the current u-boot to a binary file with the OOB info (ECC). My thought was I could then write that binary with the OOB info directly back to the NAND:
nandwrite -o -s 0 /dev/mtd0 uboot.bin
Unfortunately, that gives this error:
Writing data to block 0 at offset 0x0
Bad block at 0, 1 block(s) from 0 will be skipped
Writing data to block 1 at offset 0x20000
Bad block at 20000, 1 block(s) from 20000 will be skipped
Writing data to block 2 at offset 0x40000
ioctl(MEMGETBADBLOCK): Invalid argument
Data was only partially written due to error
: Invalid argument
My next thought was maybe I could use dd:
dd if=uboot.bin of=/dev/mtd0 bs=270336 count=1
1+0 records in
0+1 records out
Any suggestions (other than I should have followed Mike's directions and unified the ECC when I had the chance)?
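For reference, the 270336 passed to dd above is what you get if each 2048-byte page is dumped together with 64 bytes of OOB. That page geometry is an assumption (it is consistent with the 0x20000 block size in the nandwrite log, but it is not stated in the post), and the function name below is made up for illustration. A minimal sketch of the arithmetic:

```c
#include <stdint.h>

/*
 * Size in bytes of a dump that stores every page's data followed by
 * that page's OOB bytes. Returns 0 if data_len is not a whole number
 * of pages. page_size and oob_size are assumptions about the chip.
 */
static uint32_t dump_size_with_oob(uint32_t data_len,
                                   uint32_t page_size,
                                   uint32_t oob_size)
{
    if (page_size == 0 || data_len % page_size != 0)
        return 0;

    uint32_t pages = data_len / page_size;
    return pages * (page_size + oob_size);
}
```

With data_len = 262144, page_size = 2048 and oob_size = 64 this gives 128 pages of 2112 bytes each, which is the 270336 used as the dd block size.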
2010-09-10 19:40:08 Re: Write different ECC format to device
Mike Frysinger (UNITED STATES)
your dumping problem isn't related to ECC, it's bad block marking. while both of these use the OOB area, they are independent issues. as documented, neither location matches between the two OOB layouts (bootrom and default Linux layout), so Linux will consider all pages to be bad before it even gets a chance to check the ECC values.
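A rough illustration of why a layout mismatch makes every block look bad. A bad block check boils down to reading one marker byte from the OOB and comparing it against 0xFF. The sketch below assumes a 64-byte OOB and uses a hypothetical helper name. The marker offsets in the usage note (0 for the default large-page layout, 63 for the other layout) are illustrative assumptions, not values taken from this thread.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if the OOB marks the block bad, 0 if it looks good, and
 * -1 if marker_offs lies outside the OOB buffer. A block is treated
 * as good only when the marker byte is still 0xFF.
 */
static int oob_marks_bad(const uint8_t *oob, size_t oob_len,
                         size_t marker_offs)
{
    if (marker_offs >= oob_len)
        return -1;

    return oob[marker_offs] != 0xFF;
}
```

If one layout stores ECC bytes at the offset where the other layout expects the marker, a perfectly good block fails the check. For example, an OOB buffer with an ECC byte such as 0x5A at offset 0 and 0xFF everywhere else reads as bad when checked at offset 0 and as good when checked at offset 63.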
in this case, it seems like the --ignoreerrors option would be useful, but if you read nanddump.c, you'll see that this option never actually gets used. so we need to fix that first. see simple attached patch.
then in terms of restoring it, you'll see that nandwrite.c lacks any sort of ignore option, and so it'll always skip the "bad" blocks. adding it though is also easy, so see the second attached patch.
not that this helps. NAND, like any flash, needs to be erased before it can be written, and the kernel checks whether a block is bad before it tries to erase it. that check reads the OOB with the wrong layout and concludes that every block is bad, which means you can't erase the blocks. i don't think there is a way to tell the kernel to attempt the erase anyway (like u-boot's "scrub" option) from kernel space, let alone user space.
presumably your NAND flash is also your rootfs which means you have no way of unloading the MTD layer and replacing it with a tweaked one which allows erasing of bad blocks (or even a fixed Blackfin NFC driver).
if you're feeling adventurous, about the only thing that will work without fixing the actual problem (having Linux use the correct OOB layout), is to patch the kernel at runtime. you'll either need /proc/kallsyms, or the System.map that matches your running kernel.
look up the address of nand_block_bad and simply write the 32-bit value 0x106000 to that address (the opcodes for r0=0;rts;). see the attached patcher.c as an example of the write.
root:/> grep nand_block_bad /proc/kallsyms
0010143c t _nand_block_bad
now nand_block_bad should always return 0 to mean "block is not bad", and you can safely `flash_eraseall` followed by `nandwrite`.
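The attached patcher.c is not reproduced in this thread, so the following is only an independent sketch of what such a write could look like, not that file. It assumes that 0x106000 is the two 16-bit opcodes 0x6000 (r0=0) and 0x0010 (rts) stored little-endian, and that kernel memory can be written through a file such as /dev/mem on the running system. The function names are hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed 16-bit opcodes, taken from splitting 0x106000 into halves. */
#define OPC_R0_EQ_0 0x6000u /* r0 = 0; */
#define OPC_RTS     0x0010u /* rts;    */

/* First instruction goes in the low half, the second in the high half. */
static uint32_t stub_word(void)
{
    return ((uint32_t)OPC_RTS << 16) | OPC_R0_EQ_0;
}

/*
 * Write the 4-byte "return 0" stub at byte offset addr of the file at
 * path. Returns 0 on success, -1 on any error.
 */
static int patch_return_zero(const char *path, off_t addr)
{
    uint32_t w = stub_word();
    /* Spell out little-endian byte order instead of relying on the host. */
    unsigned char bytes[4] = {
        (unsigned char)(w & 0xff),
        (unsigned char)((w >> 8) & 0xff),
        (unsigned char)((w >> 16) & 0xff),
        (unsigned char)((w >> 24) & 0xff),
    };
    int rc = -1;

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    if (lseek(fd, addr, SEEK_SET) == addr &&
        write(fd, bytes, sizeof(bytes)) == (ssize_t)sizeof(bytes))
        rc = 0;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

On the system shown above the call would be something like patch_return_zero("/dev/mem", 0x0010143c), using whatever address your own kallsyms or System.map reports. Whether the kernel allows that write, and whether the instruction cache picks up the change immediately, depends on the running kernel, so try it on a sacrificial board first.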
although, if you're feeling this adventurous, you could even look up the symbol of nand_oob_128 (or whatever nand_oob is being used by default for your processor) and rewrite the contents to match linux-2.6.x/drivers/mtd/nand/bf5xx_nand.c:bootrom_ecclayout. and then do the same for largepage_flashbased (set it to the same contents as bootrom_bbt). then you should be able to use all of the standard MTD utils and have everything "just work".
i wonder if adding an option to ldr-utils so that it'd generate a LDR with the OOB data inline would be useful ...