Opened 3 years ago

Last modified 3 years ago

#11210 new defect

erase-blocks doesn't work with Toshiba eMMC devices

Reported by: jnettlet
Owned by: wad
Priority: low
Milestone: 1.75-hardware
Component: hardware
Version: 1.75-B1
Keywords: XO-1.75, eMMC
Cc: wad
Blocked By:
Blocking:
Deployments affected:
Action Needed: review
Verified: no

Description

OLPC 1B2, 508 MiB memory installed, 4 GB internal storage, S/N SHC12901A38

The eMMC in this machine no longer passes the self-test. From OFW:

ok test int:0
SDHCI: Error: ISR = 8000 ESR = 1 Command Timeout, 
Command reg: 113a Mode reg: 13 Arg reg: 0 
Recent commands (decimal): 17 
Stopping

int:0 selftest failed. Return code = -1

I was working on the graphics driver DMA code and saw filesystem corruption. The kernel code could have caused the corruption that led to the failure, or it could have been a coincidence. Regardless, this machine has been shut down uncleanly many times while debugging the graphics clock hardware hang and various other bugs.

Change History (8)

comment:1 Changed 3 years ago by jnettlet

After another reboot the eMMC came back. It is this block of code that busies out the eMMC.

       \ XO-1.5 complete erasure of storage
       \ WARNING: practically irreversible
       open-nand
       " size" $call-nand d# 512 um/mod nip 0 swap
       " erase-blocks" $call-nand
       close-nand

After erase-blocks is called, any further tests on int:0 result in the above behavior.
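
For readers unfamiliar with the Forth stack juggling in the snippet, here is a rough Python rendering of the arithmetic it performs. It assumes that "size" returns the device size in bytes and that erase-blocks takes a starting block followed by a block count; the function name `erase_range` and the 4 GiB figure are illustrative only, not values read from this machine.

```python
SECTOR_BYTES = 512  # block size used by the snippet (d# 512)

def erase_range(device_size_bytes):
    """Return (first_block, block_count) as the snippet leaves them on the stack."""
    # "um/mod nip": divide by 512 and discard the remainder
    block_count, _remainder = divmod(device_size_bytes, SECTOR_BYTES)
    # "0 swap": put starting block 0 underneath the count
    return 0, block_count

# Hypothetical device of exactly 4 GiB; the real device reports its own size.
print(erase_range(4 * 1024**3))  # (0, 8388608)
```

In other words, the snippet asks for every 512-byte block on the device to be erased in a single request.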

comment:2 Changed 3 years ago by jnettlet

Tested on my 1.75 A3, which is not in use; the above code snippet succeeds and zeros out the eMMC.

comment:3 Changed 3 years ago by wad

The A3 units all used SanDisk eMMC. This bug appears on laptops with Toshiba eMMC devices.

comment:4 Changed 3 years ago by Quozl

Tested erase-blocks in XO-1.75 B1 with board revision 1B2. This system does not exhibit the failure of "test int:0" normally.

The error generated by erase-blocks is not the same as in the ticket description, in that the failing command is ERASE_WR_BLK_START (CMD 32), which is the first of three commands in the erase sequence, and is the command that specifies the starting block.

After this failure, no further commands seem to be possible; the timeout condition persists until a power cycle.
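
As a sketch of the three-command erase sequence described above, the following Python lists the commands and their arguments for a block range. The SD command set uses CMD32, CMD33 and CMD38; the JEDEC eMMC command set defines CMD35 (ERASE_GROUP_START) and CMD36 (ERASE_GROUP_END) ahead of CMD38 instead. Whether that difference explains the Toshiba behaviour is not established in this ticket. The helper name `erase_commands` is hypothetical, and block (sector) addressing is assumed.

```python
# Command indices from the SD and JEDEC eMMC command sets.
SD_ERASE_SEQUENCE = [
    (32, "ERASE_WR_BLK_START"),
    (33, "ERASE_WR_BLK_END"),
    (38, "ERASE"),
]
EMMC_ERASE_SEQUENCE = [
    (35, "ERASE_GROUP_START"),
    (36, "ERASE_GROUP_END"),
    (38, "ERASE"),
]

def erase_commands(first_block, block_count, sequence=SD_ERASE_SEQUENCE):
    """Return (cmd_index, name, argument) tuples for erasing a block range."""
    last_block = first_block + block_count - 1
    # start address, end address, then the erase command itself (argument 0)
    args = [first_block, last_block, 0]
    return [(idx, name, arg) for (idx, name), arg in zip(sequence, args)]

for cmd in erase_commands(0, 8):
    print(cmd)
```

The failure reported here happens at the first entry of the SD-style sequence, before any end address or erase command is issued.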


@jnettlet, if the "test int:0" still fails after a power cycle, with a "Recent commands (decimal): 17", then I agree that the device is faulty. CMD17 is READ_SINGLE_BLOCK, and the argument of zero means it is the first block of the device.
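
The register dump in the ticket description can be decoded by hand, and this small Python sketch shows one way to do it. It assumes the logged values are hexadecimal and that ISR, ESR, "Command reg" and "Mode reg" correspond to the Normal Interrupt Status, Error Interrupt Status, Command and Transfer Mode registers of the standard SD Host Controller layout; the function name `decode_sdhci` is made up for illustration.

```python
def decode_sdhci(isr, esr, command_reg, mode_reg):
    """Pick out the fields relevant to this ticket from an SDHCI error dump."""
    return {
        "error_interrupt": bool(isr & 0x8000),       # ISR bit 15: an error is flagged
        "command_timeout": bool(esr & 0x0001),       # ESR bit 0: command timeout
        "command_index": (command_reg >> 8) & 0x3F,  # Command bits 13:8
        "data_present": bool(command_reg & 0x20),    # Command bit 5
        "dma_enable": bool(mode_reg & 0x01),         # Transfer Mode bit 0
        "read_direction": bool(mode_reg & 0x10),     # Transfer Mode bit 4 (1 = read)
    }

# Values from the description: ISR = 8000, ESR = 1, Command reg = 113a, Mode reg = 13
decoded = decode_sdhci(0x8000, 0x1, 0x113A, 0x13)
print(decoded["command_index"])  # 17
```

Under those assumptions the dump describes a read with data (command index 17, matching the "Recent commands" line) that timed out at the command phase, which is consistent with the device not responding at all.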

Just wondering; what method have you been using to force a shutdown during graphics driver development?

comment:5 Changed 3 years ago by wad

  • Action Needed changed from never set to diagnose
  • Component changed from not assigned to ofw - open firmware
  • Keywords XO-1.75 added; hardware failure removed
  • Milestone changed from 1.75-hardware to 1.75-firmware
  • Owner set to wmb@…
  • Summary changed from eMMC dead in XO 1.75 B1 to erase-blocks doesn't work with Toshiba eMMC devices

Confirmed that it is possible to trivially wedge a Toshiba eMMC using the above code snippet.

This problem goes away with a reboot, so it isn't an "eMMC dead" problem, but rather an "erase-blocks method of nand doesn't work with Toshiba eMMC devices" problem.

comment:6 Changed 3 years ago by Quozl

  • Owner changed from wmb@… to Quozl
  • Status changed from new to assigned

Wad, what do you want OpenFirmware to do about this? We don't use erase-blocks except as part of diagnosis. fs-update does not erase.

comment:7 Changed 3 years ago by martin.langhoff

  • Cc wad added
  • Priority changed from normal to low

Split off failure analysis of Jon's unit to ticket #11276.

Keeping this ticket to track action around the "erase-blocks doesn't work with Toshiba eMMC devices" issue, which seems low pri.

comment:8 Changed 3 years ago by Quozl

  • Action Needed changed from diagnose to review
  • Component changed from ofw - open firmware to hardware
  • Milestone changed from 1.75-firmware to 1.75-hardware
  • Owner changed from Quozl to wad
  • Status changed from assigned to new