#12516 closed defect (fixed)

Reimaging with fs-update fails if USB keyboard attached to other XO-4 USB port

Reported by: greenfeld
Owned by: Quozl
Priority: normal
Milestone: 4-firmware
Component: ofw - open firmware
Version: Development build as of this date
Keywords:
Cc:
Blocked By:
Blocking:
Deployments affected:
Action Needed: no action
Verified: no

Description

  1. Attach a USB keyboard (in this case a Lenovo ThinkPad Keyboard/Touchpad/J-Mouse unit) to the USB port on the audio jack side of a SKU 296 XO. (This is a different 296 unit from the one with the known runin hub issues.)
  2. Attach a SanDisk 16 GB USB stick to the other USB port.
  3. Power on the XO (with Q7B14) and get into OFW.
  4. Run "fs-update u:\31028o4.zd" and wait a few minutes. You are likely to encounter a data abort before the re-imaging process continues.
  5. Power off the XO and remove the keyboard. Power on the XO and try re-imaging again. Every time I tried without the USB keyboard present, the re-imaging succeeded.

Attachments (1)

usbabort.txt (5.0 KB) - added by greenfeld 22 months ago.
Abort log with USB failure


Change History (8)

Changed 22 months ago by greenfeld

Abort log with USB failure

comment:1 Changed 22 months ago by Quozl

Thanks.

I do not have a SanDisk 16 GB USB drive to test with, but I have tested with a SanDisk Cruzer Colors 4 GB USB drive, and a flexible USB keyboard. With a test build that fixes #12466 and avoids heap allocation in the USB keyboard interrupt driver, I have reproduced several symptom patterns:

  • a successful fs-update but with loss of USB keyboard response afterwards,
  • a "Wrong expanded data length",
  • a "Short read of zdata file" followed by loss of USB drive access, and
  • a hang,

but not your symptom.

Could you please also try with the USB keyboard interrupt driver turned off? (Theory: interference between the keyboard driver and the storage driver.) It fixes the symptom for me, but I'd like to see if your device combination does the same. Here is how I did it:

ok .alarms \ display the scheduled alarms
Action                Ihandle  Interval  Remaining
poll-tty                    0      a         3
get-scan             fd9ffa60      1         1
get-scan             fd9f77d0      a         3
ok usb-keyboard-ih . \ check that the USB keyboard instance handle is in the list
fd9f77d0
ok usb-keyboard-ih iselect ' get-scan 0 alarm iunselect \ remove it
ok .alarms \ check that it was removed
Action                Ihandle  Interval  Remaining
poll-tty                    0      a         9
get-scan             fd9ffa60      1         1
ok fs-update ...
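
The transcript above relies on the Open Firmware convention that calling `alarm` with an interval of 0 cancels the periodic call for the currently selected instance. The following is only a toy model of that behaviour, written in Python for illustration; the class and method names are hypothetical and are not taken from the OFW source, and the table holds just the three entries shown by `.alarms`.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    action: str    # name of the word run periodically, e.g. "get-scan"
    ihandle: int   # instance the word runs in (0 means no instance)
    interval: int  # period, as printed in the Interval column


class AlarmTable:
    """Toy model of the list that .alarms prints (not OFW source code)."""

    def __init__(self):
        self.entries = []

    def alarm(self, action, ihandle, interval):
        # Drop any existing entry for this (action, instance) pair first.
        self.entries = [e for e in self.entries
                        if (e.action, e.ihandle) != (action, ihandle)]
        # A non-zero interval (re)installs the entry; zero leaves it removed.
        if interval != 0:
            self.entries.append(Alarm(action, ihandle, interval))

    def show(self):
        return [(e.action, hex(e.ihandle), e.interval) for e in self.entries]


table = AlarmTable()
table.alarm("poll-tty", 0x0, 0xA)
table.alarm("get-scan", 0xFD9FFA60, 0x1)
table.alarm("get-scan", 0xFD9F77D0, 0xA)  # USB keyboard instance

# Equivalent of: usb-keyboard-ih iselect ' get-scan 0 alarm iunselect
table.alarm("get-scan", 0xFD9F77D0, 0)

remaining = table.show()
```

In this model only the entry for the USB keyboard instance disappears; the other `get-scan` entry and `poll-tty` are untouched, which matches the second `.alarms` listing.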

If the above test does not pass without error, could you please also try with an external powered USB hub between the laptop and both USB devices? (Theory: USB power problems.)

comment:2 Changed 22 months ago by greenfeld

Retesting with Q7B14, I saw the "Wrong expanded data length" symptom twice in two tries without disabling the keyboard handler, but not the Data Abort one.

With the USB keyboard handler removed, I was able to re-image successfully twice with the USB keyboard attached, power cycling and re-entering the disabling commands in between.

The layout of the USB devices I am using, as seen by OFW, is:

/usb@d4208000/hub@0,0
/usb@d4208000/hub@0,0/hub@3,0
/usb@d4208000/hub@0,0/scsi@2,0
/usb@d4208000/hub@0,0/hub@3,0/device@4,1
/usb@d4208000/hub@0,0/hub@3,0/mouse@4,0
/usb@d4208000/hub@0,0/hub@3,0/hid@3,1
/usb@d4208000/hub@0,0/hub@3,0/keyboard@3,0
/usb@d4208000/hub@0,0/scsi@2,0/disk

comment:3 Changed 22 months ago by Quozl

Thanks.

I have tested continuous XO-4 fs-update with the USB keyboard interrupt handler disabled, for 73 hours straight, with no exceptions reported.

Therefore the problem relates to the re-entrancy of the USB stack, or the interrupt handler.

Adding a coarse interlock that prevents execution of the handler while the USB stack is in use by a command does work, but the USB keyboard then becomes unresponsive at some unexpected times, such as at the "more" prompt of the directory command.
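
To illustrate why a command-wide interlock costs keyboard responsiveness, here is a minimal Python sketch. It is a toy model under stated assumptions, not the OFW implementation: the class, the `in_use` flag, and the tick-based loop are all hypothetical, and the "command" simply holds the lock for its whole duration.

```python
class UsbStack:
    """Toy model; names are hypothetical, not taken from the OFW source."""

    def __init__(self):
        self.in_use = False    # coarse lock: set for a whole command
        self.keys_polled = 0   # keyboard polls that actually ran
        self.keys_skipped = 0  # keyboard polls dropped by the interlock

    def keyboard_alarm(self):
        # Periodic handler: refuse to touch the USB stack while a command owns it.
        if self.in_use:
            self.keys_skipped += 1
            return
        self.keys_polled += 1

    def run_command(self, ticks):
        # A command holds the lock from start to finish, including any time
        # it spends waiting for the user (for example at a "more" prompt).
        self.in_use = True
        try:
            for _ in range(ticks):
                self.keyboard_alarm()  # alarm fires while the command runs
        finally:
            self.in_use = False


usb = UsbStack()
usb.keyboard_alarm()      # idle: the poll runs
usb.run_command(ticks=5)  # busy: every poll is skipped
usb.keyboard_alarm()      # idle again
```

Every alarm tick that falls inside the command is skipped, so a command that waits for a keypress while holding the lock would never see one from the USB keyboard.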

I am testing continuous XO-1.5 fs-update with the USB keyboard interrupt handler enabled, with no exceptions reported so far.

Then I shall test XO-1.5 with a hub, in the hope that this is a re-entrancy issue associated with USB hubs alone.

comment:4 Changed 22 months ago by Quozl

I have tested continuous XO-1.5 fs-update with the USB keyboard interrupt handler enabled, for 20 hours straight, with no exceptions reported. A USB keyboard was directly attached.

I have tested XO-1.5 fs-update with a USB keyboard and USB drive attached via a hub. Failures occur, including:

  • a hang 1, 2 (keyboard works, drive doesn't work, USB ERROR),
  • a Page Fault (keyboard works, drive works),
  • a Bad hash for eblock#, and
  • two instances of Short read of zdata file.

I have tested XO-1.5 fs-update with a USB keyboard attached via a hub, and a USB drive directly attached. Failures occur, including:

  • a hang 3 (keyboard works, drive doesn't work, USB ERROR).

I have tested XO-1.5 fs-update with a USB drive attached via a hub, and a USB keyboard directly attached. No failures occur, over 29 cycles.

I have tested XO-4 fs-update with a USB bar code scanner attached. The scanner appears as a keyboard. No failures occur, over 24 cycles.

Therefore the problem relates to the re-entrancy of the USB stack, or the interrupt handler, when specific models of USB keyboard or bar code scanner are used on XO-1.75 or XO-4, or attached via a hub on XO-1.5.

comment:5 Changed 22 months ago by Quozl

  • Action Needed changed from diagnose to test in build
  • Milestone changed from Not Triaged to 4-firmware

Fixed in svn 3536, which adds an interlock to skip the USB keyboard interrupt handler if the non-interrupt execution path is in the middle of a USB bulk or interrupt pipe method.
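
The idea behind this kind of interlock can be sketched as follows. This is a hedged Python illustration only, not the code committed in svn 3536: `UsbHost`, `pipe_busy`, `pipe_method`, and `bulk_read` are hypothetical names, and the alarm is simulated by a direct call in the middle of the transfer.

```python
from contextlib import contextmanager


class UsbHost:
    """Toy model; all names here are hypothetical, not OFW identifiers."""

    def __init__(self):
        self.pipe_busy = False  # set only inside a bulk/interrupt pipe method
        self.log = []

    @contextmanager
    def pipe_method(self, name):
        # Mark the stack busy for the duration of one pipe transfer only.
        self.pipe_busy = True
        try:
            yield
        finally:
            self.pipe_busy = False
            self.log.append("done " + name)

    def keyboard_alarm(self):
        # Interrupt-time handler: skip this tick rather than re-enter the stack.
        if self.pipe_busy:
            self.log.append("kbd skipped")
            return False
        self.log.append("kbd polled")
        return True

    def bulk_read(self):
        with self.pipe_method("bulk-in"):
            self.keyboard_alarm()  # alarm fires mid-transfer: skipped


host = UsbHost()
host.bulk_read()
host.keyboard_alarm()  # between transfers the keyboard is serviced again
```

Because the flag is held only for a single pipe method rather than a whole command, the keyboard handler in this model loses at most the ticks that coincide with a transfer and runs normally in between.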

Test builds:

  • q7b14jc.rom for XO-4 (testcase: attach USB keyboard directly, attach USB drive directly, see below),
  • q3c10jb.rom for XO-1.5 (testcase: attach USB keyboard via an external hub, attach USB drive directly, see below).

Common testcase conclusion: run fs-update repeatedly; each run should complete without error, and the USB keyboard should be functional afterwards.

comment:6 Changed 22 months ago by Quozl

Passed 268 cycles on XO-4, and 271 cycles on XO-1.5.

(A flexible USB keyboard did stop responding during the XO-4 test, with control-set operations returning USB error code 8 (STALL), but the significance is not yet clear.)

comment:7 Changed 22 months ago by Quozl

  • Action Needed changed from test in build to no action
  • Resolution set to fixed
  • Status changed from new to closed

Passed 981 cycles on XO-4. The fix is in Q7B15.
