Ticket #4476 (new defect)

Opened 7 years ago

Last modified 4 years ago

Possible USB problem

Reported by: carrano
Owned by: dwmw2
Priority: blocker
Milestone: 8.2.0 (was Update.2)
Component: wireless
Version:
Keywords:
Cc: dilinger, richard
Action Needed: never set
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description (last modified by kimquirk)

This is a fork from #4470.

The USB bus stops working properly on unit SHF73300050 after some hours.

- USB flash drives won't work
- the Marvell chip is isolated (it forwards mesh frames but does not work at layer 3)

Change History

  Changed 7 years ago by carrano

  • owner changed from jg to mlj
  • component changed from distro to hardware

follow-up: ↓ 5   Changed 7 years ago by jg

  • owner changed from mlj to wad

Life is much easier if you indicate the build type (if you know it) rather than just the serial number; working backwards from the serial number is harder.

Note dilinger just fixed another USB driver bug, fixed in 622 and joyride.

  Changed 7 years ago by jg

  • cc dilinger added
  • milestone changed from Never Assigned to XM - killjoy

Andres, is this the USB bug you fixed?

  Changed 7 years ago by kimquirk

  • keywords killjoy? added

in reply to: ↑ 2   Changed 7 years ago by carrano

Replying to jg:

Life is much easier if you indicate the build type (if you know it) rather than just the serial number; working backwards from the serial number is harder. Note dilinger just fixed another USB driver bug, fixed in 622 and joyride.

(from #4470) C1 - joyride 81 - q2d02 - 19.p0

  Changed 7 years ago by wad

My ECO logs don't show that this unit (SHF73300050) has not been ECO'd, making this a duplicate of a bug that has already been fixed.

I suggest closing this unless: 1.) the unit HAS been ECOd (check under the battery), or 2.) it is reproduced on an ECOd machine.

  Changed 7 years ago by wad

s/has not been/has been/

  Changed 7 years ago by carrano

  • cc richard added

I really think that it's a USB issue, and I believe Richard Smith can confirm that. If so, should we close this ticket?

follow-up: ↓ 10   Changed 7 years ago by wad

Can you confirm the ECO status by checking under the battery?

As stated three comments earlier, unless it has been ECOd to C2 level (usually indicated by "All C2 Mods"), it should be closed as a duplicate of 1752.

in reply to: ↑ 9   Changed 7 years ago by carrano

Replying to wad:

Can you confirm the ECO status by checking under the battery? As stated three comments earlier, unless it has been ECOd to C2 level (usually indicated by "All C2 Mods"), it should be closed as a duplicate of 1752.

There is nothing indicating that this is an ECO'd unit.

  Changed 7 years ago by wad

  • status changed from new to closed
  • resolution set to duplicate

Closing as duplicate of #1752

follow-up: ↓ 14   Changed 7 years ago by kimquirk

  • status changed from closed to reopened
  • description modified
  • summary changed from Possible USB problem in C1 unit to Possible USB problem
  • priority changed from high to blocker
  • keywords killjoy? removed
  • resolution duplicate deleted

This is not necessarily the same as #1752, so I am reopening this bug. When you have lost WLAN connectivity, AND USB connectivity, then you have this bug, #4476. The radio still forwards traffic (you can only see this by sniffing).

'iwlist scanning' returns no results and 'iwpriv msh0 fwt_time' returns bad address, and there is no acknowledgement when you insert a USB stick.

For clarification: #4470 represents the case when just the 'iwlist' and 'iwpriv' show that you can't talk to the WLAN; but USB continues to work.
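
The distinction between the two tickets can be written down as a small decision rule. This is only a sketch of the triage described above, not a tool from the OLPC tree: the `Symptoms` fields and the `classify` function are hypothetical names, and the three observations have to be collected by hand (running 'iwlist scanning', running 'iwpriv msh0 fwt_time', and inserting a USB stick).

```python
from dataclasses import dataclass


@dataclass
class Symptoms:
    """Hand-collected observations from an affected unit (hypothetical structure)."""
    scan_returns_results: bool  # 'iwlist scanning' lists at least one cell
    fwt_time_ok: bool           # 'iwpriv msh0 fwt_time' does not return "bad address"
    usb_stick_detected: bool    # inserting a USB stick is acknowledged by the host


def classify(s: Symptoms) -> str:
    """Map observations to the ticket they match, following the comment above."""
    # Host-to-WLAN is considered lost only when both wireless checks fail,
    # which is the combination the ticket describes.
    wlan_lost = (not s.scan_returns_results) and (not s.fwt_time_ok)
    if not wlan_lost:
        return "neither"
    # WLAN lost but USB still working is #4470; WLAN and USB both lost is #4476.
    return "#4470" if s.usb_stick_detected else "#4476"
```

Any other combination of observations is reported as "neither", since the ticket does not describe it.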

  Changed 7 years ago by jcardona

I reproduced this problem on:

build: joyride-251
kernel: http://dev.laptop.org/~dilinger/stable/kernel-2.6.22-20071106.1.olpc.392edb0680e0d8a.i586.rpm
xo: non-ECO'd B4

A good test to characterize this bug:

# echo 0x00020000 > /sys/module/libertas/parameters/libertas_debug
# echo 9 > /proc/sysrq-trigger

produces this output:

[78928.116648] usb8xxx usbd: 1-1:usb_submit_urb failed
[78928.116666] usb8xxx usbd: 1-1:*** type = 1
[78928.116684] usb8xxx usbd: 1-1:size after = 204
[78928.116702] usb8xxx usbd: 1-1:usb_submit_urb failed
[78928.116720] usb8xxx usbd: 1-1:*** type = 1
[78928.116737] usb8xxx usbd: 1-1:size after = 204
[78928.116755] usb8xxx usbd: 1-1:usb_submit_urb failed
[78928.116774] usb8xxx usbd: 1-1:*** type = 1
[78928.116791] usb8xxx usbd: 1-1:size after = 204
...
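
To quantify the pattern in the output above, the failures can be tallied per USB device. The following is a minimal sketch that assumes the log lines look exactly as pasted here (a timestamp, then "usb8xxx usbd: <device>:<message>"). `count_urb_failures` is a hypothetical helper, not part of the libertas driver.

```python
import re
from collections import Counter

# Matches lines such as:
#   [78928.116648] usb8xxx usbd: 1-1:usb_submit_urb failed
LINE_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+usb8xxx usbd: (?P<dev>[0-9.\-]+):(?P<msg>.*)$"
)


def count_urb_failures(log_text):
    """Return {usb_device: number of 'usb_submit_urb failed' lines}."""
    failures = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("msg").strip() == "usb_submit_urb failed":
            failures[m.group("dev")] += 1
    return dict(failures)


# A few lines copied from the output above.
SAMPLE = """\
[78928.116648] usb8xxx usbd: 1-1:usb_submit_urb failed
[78928.116666] usb8xxx usbd: 1-1:*** type = 1
[78928.116684] usb8xxx usbd: 1-1:size after = 204
[78928.116702] usb8xxx usbd: 1-1:usb_submit_urb failed
"""

if __name__ == "__main__":
    print(count_urb_failures(SAMPLE))
```

Feeding it the captured kernel log from an affected unit and from a healthy one would give a rough comparison of how often URB submission fails on each.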

I had two xo's with identical software, one of them "ECO'd to C2 circuitry 10/07/07". Only the non-ECO'd xo failed (after 7 hours).

in reply to: ↑ 12 ; follow-up: ↓ 15   Changed 7 years ago by jcardona

Replying to kimquirk:

This is not necessarily the same as #1752, so I am reopening this bug. When you have lost WLAN connectivity, AND USB connectivity, then you have this bug, #4476.

What is the difference between this bug and #1752? If you lose USB connectivity you also lose WLAN connectivity (host to wireless).

The radio still forwards traffic (you can only see this by sniffing).

Note that the capability to forward wireless traffic is not affected by usb problems.

in reply to: ↑ 14   Changed 7 years ago by carrano

Replying to jcardona:

Replying to kimquirk:

This is not necessarily the same as #1752, so I am reopening this bug. When you have lost WLAN connectivity, AND USB connectivity, then you have this bug, #4476.

What is the difference between this bug and #1752? If you lose USB connectivity you also lose WLAN connectivity (host to wireless).

Javier, I think the point is that there are cases where you cannot scan wireless but the USB bus seems to be working (#4470), and cases where you have a USB problem (for instance, you cannot use a USB stick) on top of #4470. The latter is the present ticket (#4476).

The radio still forwards traffic (you can only see this by sniffing).

Note that the capability to forward wireless traffic is not affected by usb problems.

Yes, what we see here is that the radio is still working (forwarding frames) but you cannot access it from the host. That's why we believe that the firmware is not crashing.

  Changed 7 years ago by dwmw2

I'd like to see the full SysRq-T output when this situation arises. I suspect a locking issue in the Linux USB stack.

  Changed 7 years ago by dwmw2

There are certainly locking issues with the driver. Here's a deadlock, for example:

[  930.492067] libertas_work S 00000000  2028  1232      2 (L-TLB)
[  930.526352]        cd3a5d44 00000046 00000000 00000000 c051767a 00000001 cd5ca958 00000006
[  930.536108]        cd00f570 47a37787 000000c2 3b9aca00 cd00f690 17b065bb 000000c2 00000001
[  930.573764]        cd5ca000 cae80cdc cd3a5ed0 cd3a5edc d0a6931d cd3a5dcc cd3a5da0 c04df346
[  930.611181] Call Trace:
[  930.669566]  [<d0a6931d>] libertas_send_specific_ssid_scan+0x167/0x1f5 [libertas]
[  930.706080]  [<d0a742db>] assoc_helper_associate+0x353/0x5a2 [libertas]
[  930.741840]  [<d0a75365>] libertas_association_worker+0xe3b/0x10d3 [libertas]
[  930.778442]  [<c041e02c>] run_workqueue+0x93/0x125
[  930.812675]  [<c041e7e3>] worker_thread+0xb7/0xc4
[  930.846737]  [<c0420d1e>] kthread+0x39/0x5f
[  930.880166]  [<c0404017>] kernel_thread_helper+0x7/0x10
[  925.808289] khubd         D C92B34F5  2436    49      2 (L-TLB)
[  925.842365]        c1241dd8 00000046 cc3a0570 c92b34f5 000000cb c12ca1f8 00000003 0000000a
[  925.852132]        cedbb030 cad3d7e8 000000cb 000051e2 cedbb150 c1241e1c c1241e1c cd5ca950
[  925.889709]        c1241e18 c1241e1c c1241df4 c1241e00 c0616ec6 00000001 cedbb030 c040e9ab
[  925.927534] Call Trace:
[  925.986802]  [<c0616ec6>] wait_for_completion+0x6c/0x91
[  926.020926]  [<c041e1a9>] flush_cpu_workqueue+0x4f/0x65
[  926.054676]  [<c041e1f1>] destroy_workqueue+0x32/0x57
[  926.087646]  [<d0a5b87f>] libertas_remove_card+0x17b/0x2d0 [libertas]
[  926.122435]  [<d08a4f0e>] 0xd08a4f0e
[  926.154131]  [<c0545a13>] usb_unbind_interface+0x30/0x72
[  926.187898]  [<c051ab86>] __device_release_driver+0x74/0x90
[  926.221915]  [<c051af6b>] device_release_driver+0x2f/0x45
[  926.255709]  [<c051a51a>] bus_remove_device+0x61/0x6f
[  926.288592]  [<c0518ead>] device_del+0x1d6/0x24c
[  926.320722]  [<c05436e6>] usb_disable_device+0x5f/0xbc
[  926.353103]  [<c05401b3>] usb_disconnect+0x94/0xf0
[  926.384443]  [<c05407e0>] hub_thread+0x2ec/0x996
[  926.415421]  [<c0420d1e>] kthread+0x39/0x5f
[  926.445919]  [<c0404017>] kernel_thread_helper+0x7/0x10

This causes all USB operations to block.
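
When reading a long SysRq-T dump, it helps to pull out each task's state and call chain. A minimal sketch follows. It assumes the task header columns are name, state, saved PC, free stack, PID and parent PID, which is what the two headers above appear to show. `parse_tasks` is a hypothetical helper, and `SAMPLE` is abbreviated from the traces above.

```python
import re

# Task header, e.g.
#   [  925.808289] khubd         D C92B34F5  2436    49      2 (L-TLB)
# Assumed column order: comm, state, saved PC, free stack, PID, parent PID.
TASK_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<comm>\S+)\s+(?P<state>[A-Z])\s+"
    r"[0-9A-Fa-f]{8}\s+\d+\s+(?P<pid>\d+)\s+\d+"
)
# Stack frame, e.g.
#   [  925.986802]  [<c0616ec6>] wait_for_completion+0x6c/0x91
FRAME_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+(?P<sym>\S+)")


def parse_tasks(text):
    """Return a list of {'comm', 'state', 'pid', 'frames'} dicts, in dump order."""
    tasks = []
    for raw in text.splitlines():
        line = raw.strip()
        frame = FRAME_RE.match(line)
        if frame:
            if tasks:
                # Keep only the symbol name, dropping "+offset/size".
                tasks[-1]["frames"].append(frame.group("sym").split("+")[0])
            continue
        head = TASK_RE.match(line)
        if head:
            tasks.append({
                "comm": head.group("comm"),
                "state": head.group("state"),
                "pid": int(head.group("pid")),
                "frames": [],
            })
    return tasks


SAMPLE = """\
[  930.492067] libertas_work S 00000000  2028  1232      2 (L-TLB)
[  930.611181] Call Trace:
[  930.669566]  [<d0a6931d>] libertas_send_specific_ssid_scan+0x167/0x1f5 [libertas]
[  930.706080]  [<d0a742db>] assoc_helper_associate+0x353/0x5a2 [libertas]
[  925.808289] khubd         D C92B34F5  2436    49      2 (L-TLB)
[  925.927534] Call Trace:
[  925.986802]  [<c0616ec6>] wait_for_completion+0x6c/0x91
[  926.020926]  [<c041e1a9>] flush_cpu_workqueue+0x4f/0x65
[  926.054676]  [<c041e1f1>] destroy_workqueue+0x32/0x57
[  926.087646]  [<d0a5b87f>] libertas_remove_card+0x17b/0x2d0 [libertas]
"""

if __name__ == "__main__":
    for task in parse_tasks(SAMPLE):
        print(task["comm"], task["state"], task["pid"], " <- ".join(task["frames"]))
```

On the sample, khubd is in state D with wait_for_completion at the top of a chain that runs down to destroy_workqueue and libertas_remove_card, while libertas_work is asleep in libertas_send_specific_ssid_scan. That is the pairing the traces above show.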

  Changed 7 years ago by wad

  • status changed from reopened to new
  • owner changed from wad to dwmw2

  Changed 6 years ago by thomaswamm

  • next_action set to never set

#8060 : "libertas: lights out and can't scan" might provide further evidence (wlan and USB both dead).

  Changed 4 years ago by wad

  • component changed from hardware to wireless

This was almost certainly a problem with the libertas firmware/driver, and not a hardware problem per se. Reassigning...
