Ticket #12172 (closed defect: fixed)

Opened 23 months ago

Last modified 19 months ago

XO-4 B1 8787 pingd data abort

Reported by: Quozl
Owned by: Quozl
Priority: low
Milestone: Not Triaged
Component: ofw - open firmware
Version: not specified
Keywords:
Cc:
Action Needed: no action
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description (last modified by Quozl)

XO-4 B1 with 8787 wireless, Open Firmware Q7B03jc, svn 3363.

  • a breakpoint was set in open of /wlan,
  • an essid was set, for an open 802.11g wireless network,
  • pingd was run,
  • in the open breakpoint, debug? was set to true, and execution was allowed to continue,
  • a ping was initiated from a remote host to the IP address of the target,
  • a second ping was initiated from a different remote host to the same address.

Roughly 104 seconds later, a Data Abort was printed.

           4  5  6  7  8  9  a  b   c  d  e  f  0  1  2  3  456789abcdef0123
fd7f3584  7e 00 00 00 00 00 44 00  36 00 03 00 ea 08 ff 00  ~.....D.6...j...
fd7f3594  27 53 00 24 63 bc aa b5  68 00 00 0e 00 00 00 00  'S.$c<*5h.......
fd7f35a4  00 00 00 00 96 00 08 02  00 00 ff ff ff ff ff ff  ................
fd7f35b4  00 1a 2b 84 de e8 c4 2c  03 12 ff ff ff ff ff ff  ..+.^hD,........
fd7f35c4  c4 2c 03 12 ce d4 00 36  aa aa 03 00 00 00 08 06  D,..NT.6**......
fd7f35d4  00 01 08 00 06 04 00 01  c4 2c 03 12 ce d4 0a 00  ........D,..NT..
fd7f35e4  00 02 00 00 00 00 00 00  0a 00 00 a1 00 00 00 00  ...........!....
fd7f35f4  00 00 00 00 00 00 00 00  00 00 00 00 00 00 56 8b  ..............V.
           4  5  6  7  8  9  a  b   c  d  e  f  0  1  2  3  456789abcdef0123
fd7f3bc4  a4 00 00 00 00 00 6a 00  36 00 03 00 f3 08 ff 00  $.....j.6...s...
fd7f3bd4  2a 53 00 24 37 80 b1 b5  68 00 00 1c 00 00 00 00  *S.$7.15h.......
fd7f3be4  00 00 00 00 bc 00 08 02  3a 01 20 68 9d c1 aa 9d  ....<...:. h.A*.
fd7f3bf4  00 1a 2b 84 de e8 f8 d1  11 10 20 68 9d c1 aa 9d  ..+.^hxQ.. h.A*.
fd7f3c04  f8 d1 11 10 72 6c 00 5c  aa aa 03 00 00 00 08 00  xQ..rl.\**......
fd7f3c14  45 00 00 54 00 00 40 00  40 01 26 41 0a 00 00 01  E..T..@.@.&A....
fd7f3c24  0a 00 00 68 08 00 27 6b  6d a4 07 53 9a 82 77 50  ...h..'km$.S..wP
fd7f3c34  57 c7 07 00 08 09 0a 0b  0c 0d 0e 0f 10 11 12 13  WG..............
fd7f3c44  14 15 16 17 18 19 1a 1b  1c 1d 1e 1f 20 21 22 23  ............ !"#
fd7f3c54  24 25 26 27 28 29 2a 2b  2c 2d 2e 2f 30 31 32 33  $%&'()*+,-./0123
fd7f3c64  34 35 36 37 00 00 00 00  00 00 00 00 00 00 00 00  4567.............Data Abort
ok ftrace
clean-d$-entry    Called from flush-d$-range            at  fda36e34
   Do loop frame inside flush-d$-range   i: 3f000000   limit: fd7f4280 
flush-d$-range    Called from dma-map-in                at  fda5d5ec
execute           Called from $call-self                at  fda2edcc
$call-self        Called from $call-method              at  fda2eeac
 fd9fd720
$call-method      Called from $call-parent              at  fda2eed4
$call-parent      Called from (dma-setup)               at  fdaa0c30
(dma-setup)       Called from iodma-setup               at  fdaa0ce8
iodma-setup       Called from r/w-ioblocks              at  fdaa3b5c
        0
execute           Called from $call-self                at  fda2edcc
$call-self        Called from $call-method              at  fda2eeac
 fd9f73d8
$call-method      Called from $call-parent              at  fda2eed4
$call-parent      Called from r/w-ioblocks              at  fdaa47bc
execute           Called from $call-self                at  fda2edcc
$call-self        Called from $call-method              at  fda2eeac
 fd9d70dc
$call-method      Called from $call-parent              at  fda2eed4
$call-parent      Called from (sdio-blocks!)            at  fdaa6004
(sdio-blocks!)    Called from packet-out-async          at  fdaa6054
packet-out-async  Called from data-out                  at  fdaa62ec
data-out          Called from write-force               at  fdaace20
write-force       Called from write                     at  fdaad554
execute           Called from $call-self                at  fda2edcc
$call-self        Called from $call-method              at  fda2eeac
 fd9d40bc
$call-method      Called from $call-parent              at  fda2eed4
$call-parent      Called from send-ethernet-packet      at  fda423e0
send-ethernet-packet  Called from send-link-packet      at  fda42500
execute           Called from $call-self                at  fda2edcc
$call-self        Called from $call-method              at  fda2eeac
        0
$call-method      Called from $call-net                 at  fda6a260
$call-net         Called from send-packet               at  fda6aa30
send-packet       Called from echo-packet               at  fda6aa60
echo-packet       Called from ?echo-packet              at  fda6aa8c
?echo-packet      Called from handle-requests           at  fda6aad8
handle-requests   Called from pingd                     at  fda6ab08
execute           Called from interpret-do-defined      at  fda0a524
do-defined        Called from $compile                  at  fda0a4c8
$compile          Called from (interpret                at  fda0a78c
   Catch frame - SP: fdbfbfac   my-self: 0   handler: fdbfb69c 
catch             Called from (interact)                at  fda0f21c
(interact)        Called from interact                  at  fda0f278
       32
    43478
       32
        0
        5
        0
interact          Called from (quit)                    at  fda0f2c0
ok 

A serial log of a second instance is at http://dev.laptop.org/~quozl/z/1TMVQx.txt
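
As a reading aid for the second dump, the Python sketch below decodes the 28 bytes at fd7f3c14 to fd7f3c2f as an IPv4 header followed by an ICMP header. The bytes are copied from the dump; the function names are illustrative only and are not part of Open Firmware.

```python
import struct

# 28 bytes copied from the second dump, addresses fd7f3c14 to fd7f3c2f.
RAW = bytes.fromhex(
    "4500005400004000400126410a000001"  # line fd7f3c14
    "0a0000680800276b6da40753"          # line fd7f3c24, first 12 bytes
)


def dotted(addr: bytes) -> str:
    """Format four address bytes in dotted decimal."""
    return ".".join(str(b) for b in addr)


def header_sum_ok(header: bytes) -> bool:
    """Ones' complement sum of a header, checksum included, must be 0xFFFF."""
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


def decode(raw: bytes) -> dict:
    """Decode a 20-byte IPv4 header (no options) and an 8-byte ICMP header."""
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    icmp_type, icmp_code, _icmp_csum, icmp_id, icmp_seq = struct.unpack(
        "!BBHHH", raw[20:28])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,
        "total_len": total_len,
        "ttl": ttl,
        "proto": proto,
        "src": dotted(src),
        "dst": dotted(dst),
        "icmp_type": icmp_type,
        "icmp_code": icmp_code,
        "icmp_id": icmp_id,
        "icmp_seq": icmp_seq,
        "ip_checksum_ok": header_sum_ok(raw[:20]),
    }


if __name__ == "__main__":
    for key, value in decode(RAW).items():
        print(f"{key:15} {value}")
```

Read this way, the buffer holds an ICMP echo request (type 8) from 10.0.0.1 to 10.0.0.104 with an IP total length of 84 bytes and a valid IP header checksum.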

Change History

Changed 23 months ago by Quozl

  • next_action changed from reproduce to diagnose
  • description modified

Changed 23 months ago by Quozl

  • priority changed from normal to low

The same problem occurs, with the same ftrace and without debug? enabled, if the inbound packet rate exceeds what the device can handle and a queue develops.

# ping -i 0.005 -s 1300 ${target}

Therefore this problem might occur only with protocols that apply no flow control. We use no such protocols in Open Firmware deployment scenarios.
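
The queueing argument can be sketched in Python. Only the 0.005 second interval comes from the ping command above; the service rate and the ring size are hypothetical values chosen for illustration, since the ticket gives neither.

```python
from typing import Optional

PING_INTERVAL_S = 0.005  # from "ping -i 0.005"


def seconds_until_full(arrival_pps: float, service_pps: float,
                       ring_slots: int) -> Optional[float]:
    """Time for a backlog to fill a ring of ring_slots entries.

    Returns None when the consumer keeps up and no queue develops.
    """
    growth = arrival_pps - service_pps
    if growth <= 0:
        return None
    return ring_slots / growth


if __name__ == "__main__":
    arrival = 1 / PING_INTERVAL_S   # nominal rate, about 200 packets/s
    service = 150.0                 # hypothetical packets/s the target can echo
    slots = 32                      # hypothetical ring size
    print(f"arrival rate: {arrival:.0f} packets/s")
    print(f"ring full after: {seconds_until_full(arrival, service, slots)} s")
```

With any sender that waits for a reply before sending again, the arrival rate cannot exceed the service rate for long, so the ring never fills.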

Changed 19 months ago by Quozl

Still reproduces on a slightly attenuated 802.11g network, but not on an unattenuated one.

In the ftrace, the do loop index of flush-d$-range is meant to be a virtual address, but it is found set to 7f000000 or other values that are more consistent with a physical address in the dma-mem-va >physical range.

Changed 19 months ago by Quozl

  • next_action changed from diagnose to add to build

Fixed in svn 3537.

A pause in the dots before the Data Abort was found to be flush-d$-range iterating over available memory.

The parameter stack values given to the flush specified a virtual address of h# 0000.0552. That value was in fact the encapsulated packet size of 1362 bytes, the ping being for 1300 bytes.
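
A short Python check of that arithmetic follows. The header sizes are the standard ones for ICMP, IPv4 without options, and Ethernet; attributing the remaining bytes to driver or interface framing is an assumption, as the ticket does not itemize them.

```python
ICMP_PAYLOAD = 1300      # from "ping -s 1300"
ICMP_HEADER = 8
IPV4_HEADER = 20         # assumes no IP options
ETHERNET_HEADER = 14     # assumes an untagged Ethernet header

OBSERVED = 0x552         # the "virtual address" handed to the flush

# Bytes accounted for by the well-known protocol headers.
accounted = ICMP_PAYLOAD + ICMP_HEADER + IPV4_HEADER + ETHERNET_HEADER

# Whatever is left is presumed to be device-specific framing (assumption).
unaccounted = OBSERVED - accounted

if __name__ == "__main__":
    print(f"h# {OBSERVED:x} = {OBSERVED} bytes")
    print(f"protocol headers and payload: {accounted} bytes")
    print(f"remaining, presumably driver framing: {unaccounted} bytes")
```

A value this small is far too low to be a valid DMA buffer address, which is what made it stand out as a length.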

In the SDHCI instance, dma-len held an address, which it should not; dma-vadr did not hold an address, which it should; and the io-#blocks count was negative.
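
To show why misplaced arguments would produce the long pause, the Python sketch below counts the cache lines a range flush would touch. It is a model of the symptom, not a reconstruction of flush-d$-range: the 32-byte line size is an assumption, and fd7f3bc4 is simply the start address of the second dump, used here as a plausible buffer address.

```python
CACHE_LINE = 32  # assumed cache line size in bytes


def lines_flushed(adr: int, length: int, line: int = CACHE_LINE) -> int:
    """Count the cache lines touched by flushing [adr, adr + length)."""
    if length <= 0:
        return 0
    first = adr - adr % line                 # line holding the first byte
    last_byte = adr + length - 1
    last = last_byte - last_byte % line      # line holding the last byte
    return (last - first) // line + 1


BUFFER = 0xFD7F3BC4   # illustrative buffer address taken from the dump
LENGTH = 0x552        # encapsulated packet size from the ticket

if __name__ == "__main__":
    intended = lines_flushed(BUFFER, LENGTH)
    # Address and length in each other's place, as the stack imbalance suggests.
    misplaced = lines_flushed(LENGTH, BUFFER)
    print(f"intended:  {intended} lines")
    print(f"misplaced: {misplaced} lines")
```

The intended call touches a few dozen lines, while the misplaced one walks well over a hundred million lines from near address zero upward until it reaches memory that is not mapped.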

This pointed to a stack imbalance. To locate it, code implementing stack cookies and saved values was added at various points in the call chain.

The problem was found to be in the ring buffer code, which left an item on the stack the first time the ring buffer became full.
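
The technique can be modelled in Python with a list standing in for the Forth data stack. This is a sketch under stated assumptions: the names, the cookie value, and the choice to discard the item when the ring is full are all hypothetical, and the actual Open Firmware ring buffer code is not shown in this ticket.

```python
COOKIE = 0xC00C1E  # arbitrary marker value, hypothetical


class Ring:
    """A bounded queue standing in for the driver's ring buffer."""

    def __init__(self, slots: int):
        self.slots = slots
        self.items = []


def ring_put_buggy(stack: list, ring: Ring) -> None:
    """Intended effect ( item -- ), but the full path forgets to drop item."""
    if len(ring.items) >= ring.slots:
        return  # bug: the argument stays on the stack
    ring.items.append(stack.pop())


def ring_put_fixed(stack: list, ring: Ring) -> None:
    """Always consume the argument, whether or not the ring has room."""
    item = stack.pop()
    if len(ring.items) < ring.slots:
        ring.items.append(item)  # assumption: discard the item when full


def checked_put(put, stack: list, ring: Ring, item) -> None:
    """Bracket a call with a cookie and verify the cookie is back on top."""
    stack.append(COOKIE)
    stack.append(item)
    put(stack, ring)
    if not stack or stack[-1] != COOKIE:
        raise RuntimeError("stack imbalance: cookie is not on top")
    stack.pop()


if __name__ == "__main__":
    stack, ring = [], Ring(slots=2)
    for packet in ("p1", "p2", "p3"):
        try:
            checked_put(ring_put_buggy, stack, ring, packet)
            print(packet, "queued, stack balanced")
        except RuntimeError as err:
            print(packet, "->", err)
```

The check stays silent until the ring first fills, which matches a fault that appears only once a queue has developed.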

Changed 19 months ago by Quozl

  • status changed from new to closed
  • next_action changed from add to build to no action
  • resolution set to fixed

The fix is in Q7B15.
