Ticket #9611 (closed defect: fixed)

Opened 4 years ago

Last modified 4 years ago

Page fault on USB activity

Reported by: wmb@…
Owned by: wmb@…
Priority: normal
Milestone:
Component: ofw - open firmware
Version: Development build as of this date
Keywords:
Cc:
Action Needed: test in release
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description

This is very difficult to reproduce: I have seen it once in many hundreds of reboots. Fortunately, I did a deep post-mortem when it happened, and I think I have found the root cause. More details later...

USB2 devices:
/pci/usb@10,4/scsi@1,0
/pci/usb@10,4/scsi@1,0/disk
USB1 devices:
OLPC D1, 1 GiB memory installed, S/N SHF12345678
OpenFirmware  CL1   Q3A15c Q3A

Type the Esc key to interrupt automatic startup
Page Fault
ok ftrace
l@                Called from remove-qh                 at  ff8e95c8
remove-qh         Called from done-bulk-out             at  ff8ebcd0
done-bulk-out     Called from bulk-out                  at  ff8ebdf4
execute           Called from $call-self                at  ff82a80c
$call-self        Called from $call-method              at  ff82a8ec
 ff9d45d8
$call-method      Called from $call-parent              at  ff82a914
$call-parent      Called from bulk-out                  at  ff8f5874
bulk-out          Called from (execute-command)         at  ff8f6538
(execute-command)  Called from execute-command          at  ff8f6700
        0
execute-command   Called from retry-command?            at  ff8f6fe4
retry-command?    Called from no-data-command           at  ff8f7120
execute           Called from $call-self                at  ff82a80c
$call-self        Called from $call-method              at  ff82a8ec
 ff9d4580
$call-method      Called from $call-parent              at  ff82a914
$call-parent      Called from no-data-command           at  ff8f79bc
no-data-command   Called from unit-ready?               at  ff8f7dc8
unit-ready?       Called from open                      at  ff8f8168
execute           Called from $vexecute?                at  ff82841c
   Catch frame - SP: ff9fcf58   my-self: ff9d4580   handler: ff9fcbd8
catch             Called from apply-method              at  ff82af88
apply-method      Called from (apply-method)            at  ff82b2f8
(apply-method)    Called from (open-node)               at  ff82b32c
(open-node)       Called from open-node                 at  ff82b370
open-node         Called from (open-dev)                at  ff82b5b8
   Catch frame - SP: ff9fcf9c   my-self: 0   handler: ff9fcbdc

Change History

Changed 4 years ago by Quozl

  • milestone changed from Not Triaged to 1.5-F11

Changed 4 years ago by wmb@…

  • next_action changed from code to add to release

Fixed by svn 1459. This is a theoretical fix, but I'm pretty sure that it is correct. Reproducing the failure condition is difficult because it requires inducing a stall on a USB bulk out pipe. I'm not sure how to do that on purpose, and it happens quite rarely under normal conditions.

The fix first appears in q3a15c.rom.

Changed 4 years ago by Quozl

reviewed svn 1459, no issues.

(my-{qh,qtd} dropped in favour of the new my-bulk-{qh,qtd}; no uses of the old names remain, per dev/usb2/hcd/ehci/bulk.fth)

Changed 4 years ago by wmb@…

  • status changed from new to assigned
  • next_action changed from add to release to test in release

This fix is in q3a16. Quozl, I think you can just close it, as duplicating the problem is very difficult. The problem has not recurred in any of my test builds that incorporate the change.

Changed 4 years ago by Quozl

  • status changed from assigned to closed
  • resolution set to fixed

Changed 4 years ago by anonymous

  • milestone deleted

Milestone 1.5-software deleted
