Ticket #237 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

nandflash writes fail with OFW

Reported by: cjb Owned by: dwmw2
Priority: blocker Milestone: BTest-1
Component: kernel Version:
Keywords: Cc:
Action Needed: Verified: yes
Deployments affected: Blocked By:
Blocking:

Description

This took a long time to track down. :/

dmesg attached. OFW boots the NAND flash, but writes start failing almost immediately and soon jffs2 panics. This was 100% reproducible with both usedma=1 and usedma=0: it failed for hours under OFW, and when I flashed from OFW back to LaB, it worked straight away.

I'm having trouble explaining it. I *wrote* the image to the nandflash inside OFW, so it's not that all writes from inside OFW are failing, but most writes are. The video output was full of things like "{chown,touch,rm}: I/O error".

Attachments

nand-dmesg (38.8 kB) - added by cjb 3 years ago.

Change History

Changed 3 years ago by cjb

Changed 3 years ago by blizzard

  • owner changed from blizzard to dcbw

Changed 3 years ago by jg

  • owner changed from dcbw to wmb@…

Changed 3 years ago by cjb

I should be clear about my testcase:

* flash_eraseall -j /dev/mtd0
* nandwrite -p /dev/mtd0 olpc-redhat-stream-development-build-130-20061026_0003-devel_jffs2.img
* reboot

.. it will start failing shortly after "Starting udev.." and panic() before showing the login prompt.

Build 130 is at:

http://olpc.download.redhat.com/olpc/streams/development/build130/devel_jffs2/olpc-redhat-stream-development-build-130-20061026_0003-devel_jffs2.img

Since OFW can't currently boot flash unmodified, I was doing (from memory rather than scrollback):

[interrupt autoboot]
setenv boot-disk /nandflash:\boot\vmlinuz
setenv boot-file ... root=mtd0 rootfstype=jffs2 ...
boot

Changed 3 years ago by wmb@…

  • owner changed from wmb@… to dwmw2
  • component changed from distro to kernel

As David W. suspected, the problem is that the OFW driver was leaving the CaFe NAND controller in the "write protect" state. The fix was trivial - just hit the soft-reset bit in the OFW driver close routine. I've made that change.

The Linux driver really should do the soft-reset too, in its initialization, thus ensuring a predictable init no matter what state the hardware is in. This also has the advantage of making sure the DMA engine is in a clean state.

The way to perform this reset is to first write 0x01, then 0x00 to the 32-bit chip register at offset 0x3034. You can just write that bit, no need to read/modify/write.
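For illustration, the reset sequence described above could be sketched in C roughly as follows. This is only a sketch, not the actual OFW or Linux driver code: the names reg_write32 and cafe_soft_reset and the CAFE_SOFT_RESET_OFFSET macro are hypothetical, and base is assumed to point at the start of the chip's memory-mapped register window. A real kernel driver would use its own MMIO accessors instead of a raw pointer store.

```c
#include <stddef.h>
#include <stdint.h>

/* Register offset taken from the comment above; the macro name is made up. */
#define CAFE_SOFT_RESET_OFFSET 0x3034u

/* Hypothetical helper: one 32-bit store to a memory-mapped register.
 * Assumes base + offset is 4-byte aligned. */
static void reg_write32(volatile uint8_t *base, size_t offset, uint32_t value)
{
    *(volatile uint32_t *)(base + offset) = value;
}

/* Pulse the soft-reset bit: write 0x01, then 0x00.
 * Plain writes are used, with no read/modify/write, as the comment suggests. */
static void cafe_soft_reset(volatile uint8_t *base)
{
    reg_write32(base, CAFE_SOFT_RESET_OFFSET, 0x01);
    reg_write32(base, CAFE_SOFT_RESET_OFFSET, 0x00);
}
```

Calling something like cafe_soft_reset() early in driver initialization, before any other register access, would match the suggestion of starting from a predictable state regardless of what the firmware left behind.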

I'm reassigning to David to get the change into Linux too.

Changed 3 years ago by dwmw2

  • status changed from new to closed
  • resolution set to fixed

Should be fixed in olpc-2.6 git tree. Will transfer to Fedora package later today after testing.

Thanks for the extremely coherent instructions.
