Ticket #1752 (closed defect: fixed)

Opened 7 years ago

Last modified 7 years ago

USB wireless suspend/resume failure at setup phase

Reported by: marcelo
Owned by: wad
Priority: blocker
Milestone: 8.2.0 (was Update.2)
Component: wireless
Version:
Keywords: power
Cc: jordan.crouse@…, wmb@…, luisca@…, javier@…, rchokshi@…, rsmith, GR-Wireless-OLPC@…, dilinger, marcelo@…, jg
Action Needed:
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description (last modified by jg) (diff)

Failure at different "commands" of the SETUP phase, such as SetAddress, GetDescriptor(Device), GetDescriptor(Configuration).

By analyzing USB traces, we can see that the SETUP transaction for these commands is usually successful, but subsequent steps (such as IN transactions in response to SETUP, or PING transactions in response to OUT) continue to happen for a while and then stop.

3ms after the last IN or PING transaction is seen, the device is supposed to enter suspended state (and the analyzer indicates that with "SUSPENDED (IDLE)"). The USB software then receives errors from the EHCI controller for these commands.

For the successful case, we see that the host controller has issued "non-stop" IN or PING transactions until it got an ACK from the device, which prevents the suspended state.

So, from my understanding of the USB protocol, the host controller is at fault here: it should keep issuing IN/PING transactions until an ACK is received.

At http://dev.laptop.org/~marcelo/debug/ you can find:

success_2622-rc4.ufo: successful suspend/resume cycle

set_address_2_failure.ufo: failure at the SetAddress(2) stage.

USB Explorer 200 Windows software available at http://www.ellisys.com is necessary to decode the traces.

Attachments

wakeup-failure.log (108.9 kB) - added by cjb 7 years ago.

Change History

  Changed 7 years ago by marcelo

  • cc rchokshi@… added

  Changed 7 years ago by marcelo

Additional info: a full reset of the EHCI host controller (performed by reloading the ehci-hcd module), does _not_ fix the problem:

suspend
...
resume
...
[    2.277616] usb_start_wait_urb: wait done
[    2.281657] USB_REQ_SET_CONFIGURATION!
[    2.285518] usb 1-1: can't restore configuration #1 (error=-71)
...
[   14.134066] usbcore: deregistering interface driver usb8xxx
[   36.122507] ehci_hcd 0000:00:0f.5: remove, state 1
[   36.122859] usb usb1: USB disconnect, address 1
...
[   42.858811] ehci_hcd 0000:00:0f.5: EHCI Host Controller
...
[   42.889239] ehci_hcd 0000:00:0f.5: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
...
[   43.436950] usb 1-1: unable to read config index 0 descriptor/start
[   43.436970] usb 1-1: chopping to 0 config(s)

From my understanding, this rules out 5536 erratum #56 (Stability issues upon USB EHC Port Suspend in S0) as the cause: the suggested workaround for that erratum is a full EHCI reinitialization, which is what reloading the module does.

I also attempted to reset the wireless module using the special GPIO on the EC. The reset succeeds (as can be seen by the wireless LEDs going off), but the communication is still hosed.

  Changed 7 years ago by marcelo

Additional information:

The error is detected by the kernel via the status field for the current qTD token structure for the failed command, and it is (from the EHCI spec):

3 Transaction Error (XactErr). Set to a one by the Host Controller during status
update in the case where the host did not receive a valid response from the
device (Timeout, CRC, Bad PID, etc.). Refer to Section 4.15.1.1 for summary of
the conditions that affect this bit. If the host controller sets this bit to a one, then
it remains a one for the duration of the transfer.
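
As a rough illustration of where that bit sits, here is a minimal Python sketch that decodes the status byte of a qTD token. The bit names follow the qTD token layout in the EHCI spec; the example token value is hypothetical and is not taken from any trace in this ticket.

```python
# Status bits in the low byte of an EHCI qTD token (names per the EHCI spec).
QTD_STATUS_BITS = {
    7: "Active",
    6: "Halted",
    5: "Data Buffer Error",
    4: "Babble Detected",
    3: "Transaction Error (XactErr)",
    2: "Missed Micro-Frame",
    1: "Split Transaction State",
    0: "Ping State / ERR",
}


def decode_qtd_status(token):
    """Return the names of the status bits set in a 32-bit qTD token."""
    status = token & 0xFF
    return [
        name
        for bit, name in sorted(QTD_STATUS_BITS.items(), reverse=True)
        if status & (1 << bit)
    ]


def qtd_error_counter(token):
    """Return CERR, the 2-bit error counter held in token bits 11:10."""
    return (token >> 10) & 0x3


# Hypothetical token: Halted and XactErr set, error counter at zero.
example_token = 0x00000048
print(decode_qtd_status(example_token))
print(qtd_error_counter(example_token))
```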

  Changed 7 years ago by jg

  • priority changed from normal to blocker
  • description modified (diff)
  • milestone changed from Untriaged to Trial-2

  Changed 7 years ago by jcardona

Captured another instance of the problem, available here: https://cozy1/~javier/debug_olpc_usb/resume-pings-stop-during-enumeration.usb

You will need the USBMobileHS software application available for free (prior registration) on LeCroy's website: http://www.lecroy.com/tm/Library/Software/PSG/usbmobile.asp?menuid=8

Here we see that the first GetDescriptor transfer fails:

Transaction  0:     SETUP request is ACKed by the device.
Transactions 1-28:  IN packets, NAKed by the device.
Transaction  29:    Device sends 18 bytes of descriptor.
Transaction  30:    Host sends a 0-byte packet to the device (high-level ACK), which is NAKed.
Transactions 31-34: Host sends 4 PING packets, which are NAKed.

Then no more activity for 3ms, after which the bus enters suspended state.
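
For anyone scanning exported traces by script, the 3 ms idle condition can be sketched as below. This assumes the analyzer export can be reduced to a list of activity timestamps in microseconds; the function name and the sample timestamps are made up for illustration.

```python
SUSPEND_IDLE_US = 3000  # 3 ms without bus activity, as seen in the traces


def find_idle_gaps(timestamps_us, idle_us=SUSPEND_IDLE_US):
    """Return (last_activity_us, gap_us) for every gap of at least idle_us."""
    ordered = sorted(timestamps_us)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        if gap >= idle_us:
            gaps.append((prev, gap))
    return gaps


# Hypothetical timestamps: four transactions, then silence until 4000 us.
print(find_idle_gaps([0, 125, 250, 375, 4000]))  # [(375, 3625)]
```

A gap reported right after a PING or IN that never received an ACK would match the failure pattern described above.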

follow-up: ↓ 7   Changed 7 years ago by cjb

follow-up: ↓ 9   Changed 7 years ago by jcardona

Tested the suspend/resume cycles, wireless initiated resume on the following configuration:

OFW: q2c17
wlan fw: 5.110.16
kernel: git master
commit 50822dfb280c7d63f5cf215b78ef37978060f158
+ patches http://dev.laptop.org/~marcelo/libertas_resume
+ CONFIG_DISABLE_CONSOLE_SUSPEND=y
+ CONFIG_USB_PERSIST=y
+ CONFIG_USB_EHCI_HCD=m

Two different laptops show very different behaviors:

xo1 - B3-4 SHF720000B0, resumes well. We can have hundreds of suspend/resume cycles before seeing a problem. The problems that we have seen are #1814 and #1835.

xo2 - B3-2 SHF720000B0, fails to resume. Maximum successful iterations ~10. Logs of the failure are here and here

As mentioned by marcelo in the second comment above, reloading the host controller driver does not resolve the problem. Logs here.

cjb has a laptop that also fails in the same way as xo2. Logs here.

in reply to: ↑ 8   Changed 7 years ago by jcardona

One more test: I added a debugfs hook that sends the command to the wlan firmware to detach from USB. Then, using sdkit I toggled GPIO15, which signals the wlan firmware to reattach to the USB bus.

The results on both laptops were identical: the wireless module detaches correctly, and reattaches when signaled. The setup phase (enumeration) is successful. Logs for xo1 and xo2. (Note that the wlan reset that appears after enumeration is triggered by the driver when it fails to download the firmware. Under normal resume, there is no attempt to download firmware.)

follow-up: ↓ 12   Changed 7 years ago by jcardona

  • owner changed from marcelo to richard

One more test that finally sheds some light on the problem: we have modified the wireless firmware to reattach to USB before waking up the host, instead of waiting for the host to issue the request. At the time of this writing we've done over 55 successful suspend/resume iterations on xo2, the same xo on which we could not do more than 10 before.

*Suggested fix:* Have EC send the 'USB attach' signal to the wireless module before waking up the host.

  Changed 7 years ago by jcardona

One more note on this...

When we disable the GPIO2 (WLAN_WAKE) retries in the wireless firmware, the host fails to wake up approximately every third time. cjb put a scope on that signal and did not see anything suspicious, so does it all point to a problem in the EC code? As reported above, we are using q2c17.

in reply to: ↑ 10 ; follow-up: ↓ 13   Changed 7 years ago by cjb

  • cc rsmith added

Replying to jcardona:

One more test that finally sheds some light on the problem: we have modified the wireless firmware to reattach to USB before waking up the host, instead of waiting for the host to issue the request. At the time of this writing we've done over 55 successful suspend/resume iterations on xo2, the same xo on which we could not do more than 10 before.

Excellent! Are you able to share this firmware with me for testing here?

*Suggested fix:* Have EC send the 'USB attach' signal to the wireless module before waking up the host.

Richard, if you're able to get me an EC update that does this (I can brick-test it myself), it would be a massive help.

in reply to: ↑ 12 ; follow-up: ↓ 14   Changed 7 years ago by rsmith

Replying to cjb:

Replying to jcardona: Excellent! Are you able to share this firmware with me for testing here?

*Suggested fix:* Have EC send the 'USB attach' signal to the wireless module before waking up the host.

Richard, if you're able to get me an EC update that does this (I can brick-test it myself), it would be a massive help.

I will create some firmware that does this, but I see this as a band-aid fix. There is still a problem here, and if we don't understand it then it's sure to return and bite us again later.

What this suggests to me is that the resume path has some timing issues. Let's think about how to narrow this down further.

in reply to: ↑ 13   Changed 7 years ago by cjb

Replying to rsmith:

I will create some firmware that does this, but I see this as a band-aid fix. There is still a problem here, and if we don't understand it then it's sure to return and bite us again later.

Richard's new EC gets us to the stability of the no-detach firmware without using a modified wireless firmware, which is a big step forward. I'm seeing a bug, which doesn't look like a wireless bug, where the laptop just outright refuses to wake up after some number of suspend/resumes. Pressing the power/game buttons doesn't get me a "+r", and "olpc_do_sleep!" is the last thing seen from the kernel, which suggests (as do the LEDs having gone off) that we powered down normally.

dmesg attached; there are 67 successful resumes, and then the failure.

Note that we occasionally see things like:

[  194.141477] olpc_do_sleep!
+r[    0.396028] Timeout waiting for EC to read command!
[    0.545181] Timeout waiting for EC to read command!

in the log, so it sounds like the EC interaction is not yet robust.
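
A quick way to tally this from a saved console log is sketched below. It assumes each suspend prints "olpc_do_sleep!" and each wakeup prints "+r" at the start of a line, as in the excerpt above; the function name is made up for illustration.

```python
EC_TIMEOUT_MSG = "Timeout waiting for EC to read command!"


def summarize_console_log(text):
    """Count suspends, resumes and EC command timeouts in a console log."""
    sleeps = resumes = ec_timeouts = 0
    for line in text.splitlines():
        if "olpc_do_sleep!" in line:
            sleeps += 1
        if line.startswith("+r"):
            resumes += 1
        if EC_TIMEOUT_MSG in line:
            ec_timeouts += 1
    return {"sleeps": sleeps, "resumes": resumes, "ec_timeouts": ec_timeouts}


sample = """\
[  194.141477] olpc_do_sleep!
+r[    0.396028] Timeout waiting for EC to read command!
[    0.545181] Timeout waiting for EC to read command!
"""
print(summarize_console_log(sample))
# {'sleeps': 1, 'resumes': 1, 'ec_timeouts': 2}
```

A log whose "sleeps" count is one higher than its "resumes" count ends in a suspend that never woke up.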

What this suggests to me is that the resume path has some timing issues. Let's think about how to narrow this down further.

(Agreed.)

  Changed 7 years ago by cjb

Javier's already filed a bug on this wakeup failure (#1835). I also tried wakeup from wireless; that didn't work either.

follow-up: ↓ 17   Changed 7 years ago by cjb

Hm. Actually, while we resume okay, I'm having trouble pinging anything after a resume with the new EC, on either 9854.. or 37cc.. wireless firmware. The activity lights flash appropriately, but I'm not getting replies.

in reply to: ↑ 16   Changed 7 years ago by luisca

Replying to cjb:

Hm. Actually, while we resume okay, I'm having trouble pinging anything after a resume with the new EC, on either 9854.. or 37cc.. wireless firmware. The activity lights flash appropriately, but I'm not getting replies.

Does ifconfig/iwconfig output seem normal? Can you communicate with the device (e.g. iwpriv msh0 fwt_time)?

  Changed 7 years ago by rchokshi

  • cc GR-Wireless-OLPC@… added

  Changed 7 years ago by cjb

Does ifconfig/iwconfig output seem normal? Can you communicate with the device (e.g. iwpriv msh0 fwt_time)?

Yes, all normal. fwt_time:56465090

Have you had a chance to try the modified EC yet? It's:

http://dev.laptop.org/~rsmith/q2cd74.rom

  Changed 7 years ago by cjb

  • cc dilinger added

With wireless firmware 16.p0 I fail quickly. On first resume:

[  560.802889] olpc_do_sleep!
+r[    0.089394] Timeout waiting for EC to read command!
[    0.888823] usb usb1: root hub lost power or was reset
[    1.132052] usb_reset_device!
[    1.212156] hub_port_wait_reset: portstatus=503 portchange=10
[    1.361627] hub_port_wait_reset: portstatus=503 portchange=10
[    1.441649] devpath 1 ep0out 3strikes
[    1.670587] devpath 1 ep0out 3strikes
[    1.898855] usb 1-1: device not accepting address 2, error -71
[    1.981515] hub_port_wait_reset: portstatus=503 portchange=10
[    2.128364] hub_port_wait_reset: portstatus=503 portchange=10
[    2.241735] devpath 1 ep0in 3strikes
[    2.263542] usb 1-1: device descriptor read/all, error -71
[    2.352074] hub_port_wait_reset: portstatus=503 portchange=10
[    2.471837] devpath 1 ep0in 3strikes
[    2.494652] usb 1-1: device descriptor read/8, error -71
[    2.651779] devpath 1 ep0in 3strikes
[    2.674995] usb 1-1: device descriptor read/8, error -71
[    2.870479] hub_port_wait_reset: portstatus=503 portchange=10
[    2.990001] old descriptor:
[    3.013106] bLength: 12
[    3.035505] bDescriptorType: 1
[    3.058345] bcdUSB: 200
[    3.080465] bDeviceClass: 0
[    3.102676] bDeviceSubClass: 0
[    3.124918] bDeviceProtocol: 0
[    3.146855] bMaxPacketSize0: 40
[    3.168882] idVendor: 1286
[    3.190148] idProduct: 2001
[    3.211139] bcdDevice: 3107
[    3.231915] iManufacturer: 1
[    3.252753] iProduct: 2
[    3.272615] iSerialNumber: 0
[    3.292416] bNumConfigurations: 1
[    3.312581] new descriptor:
[    3.332417] bLength: 12
[    3.351422] bDescriptorType: 1
[    3.370887] bcdUSB: 200
[    3.389498] bDeviceClass: 0
[    3.408444] bDeviceSubClass: 0
[    3.427513] bDeviceProtocol: 0
[    3.446280] bMaxPacketSize0: 40
[    3.464734] idVendor: 1286
[    3.482461] idProduct: 2001
[    3.499882] bcdDevice: 3105
[    3.517425] iManufacturer: 1
[    3.534724] iProduct: 2
[    3.551657] iSerialNumber: 0
[    3.568682] bNumConfigurations: 1
[    3.586346] USB_REQ_SET_CONFIGURATION!
[    3.606603] devpath 1 ep0out 3strikes
[    3.624703] dev->type = c068ae78
[    3.641540] devpath 1 ep2out 3strikes
[    3.658557] devpath 1 ep3in 3strikes
[    3.675041] calling resume directly!
[    3.691909] devpath 1 ep2out 3strikes
[    3.708074] devpath 1 ep3in 3strikes
[    3.799525] Restarting tasks ... <3>hub 1-0:1.0: port 1 disabled by hub (EMI.
[    3.899555] done.
-bash-3.1# [    5.117960] hub_port_wait_reset: portstatus=503 portchange=10
[    5.258437] hub_port_wait_reset: portstatus=503 portchange=10
[    6.881403] devpath 1 ep2out 3strikes
[    6.898634] devpath 1 ep3in 3strikes
[    7.980491] devpath 1 ep2out 3strikes
[    9.079246] devpath 1 ep2out 3strikes
[    9.369582] libertas: firmware not ready
[   10.176052] devpath 1 ep2out 3strikes
[   11.273981] devpath 1 ep2out 3strikes
[   12.379469] usb_reset_device!
[   12.455757] hub_port_wait_reset: portstatus=503 portchange=12
[   12.592978] hub_port_wait_reset: portstatus=503 portchange=12
[   12.709215] devpath 1 ep0in 3strikes
[   12.727176] usb 1-1: device descriptor read/all, error -71
[   12.805816] hub_port_wait_reset: portstatus=503 portchange=10
[   12.943040] hub_port_wait_reset: portstatus=503 portchange=10
[   13.060172] USB_REQ_SET_CONFIGURATION!
[   13.092709] Resetting OLPC wireless...
[   13.278743] devpath 1 ep2out 3strikes
[   13.298012] devpath 1 ep3in 3strikes
[   14.375533] usb_reset_device!
[   14.458645] hub_port_wait_reset: portstatus=503 portchange=11
[   14.480786] hub_port_wait_reset: connection bounced!
[   14.502137] logical disconnect on port 1
[   14.522655] libertas: firmware init failed
[   16.664580] usb8xxx: probe of 1-1:1.0 failed with error -12
[   16.914708] hub_port_wait_reset: portstatus=503 portchange=10
[   17.058184] hub_port_wait_reset: portstatus=503 portchange=10
[   17.175289] devpath 1 ep0in 3strikes
[   17.196685] usb 1-1: device descriptor read/all, error -71
[   17.277889] hub_port_wait_reset: portstatus=503 portchange=10
[   17.426858] hub_port_wait_reset: portstatus=503 portchange=10
[   17.540093] devpath 1 ep0in 3strikes
[   17.562213] devpath 1 ep0in 3strikes
[   17.584083] devpath 1 ep0in 3strikes
[   17.605705] devpath 1 ep0in 3strikes
[   17.627451] devpath 1 ep0out 3strikes
[   17.648639] usb 1-1: can't set config #1, error -71
[   17.670807] hub 1-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
[   17.754308] hub_port_wait_reset: portstatus=503 portchange=10
[   17.894403] hub_port_wait_reset: portstatus=503 portchange=12
[   18.045126] devpath 1 ep3in 3strikes
[   18.157364] devpath 1 ep2out 3strikes
[   20.356524] usb_reset_device!
[   20.432817] hub_port_wait_reset: portstatus=503 portchange=12
[   20.575915] hub_port_wait_reset: portstatus=503 portchange=12
[   20.655011] libertas: firmware not ready
[   20.704653] devpath 1 ep0in 3strikes
[   20.725973] usb 1-1: device descriptor read/all, error -71
[   20.804755] hub_port_wait_reset: portstatus=503 portchange=10
[   20.945351] hub_port_wait_reset: portstatus=503 portchange=12
[   21.052603] USB_REQ_SET_CONFIGURATION!
[   21.075831] devpath 1 ep0out 3strikes
[   21.111156] Resetting OLPC wireless...
[   21.132697] devpath 1 ep2out 3strikes
[   21.305050] devpath 1 ep2out 3strikes
[   21.325972] devpath 1 ep3in 3strikes
[   22.405690] usb_reset_device!
[   22.480575] hub_port_wait_reset: portstatus=503 portchange=13
[   22.504016] hub_port_wait_reset: connection bounced!
[   22.526615] logical disconnect on port 1
[   22.547939] libertas: firmware init failed
[   22.633298] usb8xxx: probe of 1-1:1.0 failed with error -12
[   22.713412] hub_port_wait_reset: portstatus=503 portchange=10
[   22.853011] hub_port_wait_reset: portstatus=503 portchange=10
[   22.964880] devpath 1 ep0in 3strikes
[   22.985691] usb 1-1: unable to read config index 0 descriptor/all
[   23.009853] usb 1-1: can't read configurations, error -71
[   23.092719] hub_port_wait_reset: portstatus=503 portchange=10
[   23.229944] hub_port_wait_reset: portstatus=503 portchange=10
[   23.474399] devpath 1 ep3in 3strikes
[   23.571140] devpath 1 ep2out 3strikes
[   25.774965] usb_reset_device!
[   25.855217] hub_port_wait_reset: portstatus=503 portchange=12
[   26.005062] hub_port_wait_reset: portstatus=503 portchange=12
[   26.125671] devpath 1 ep0in 3strikes
[   26.147952] old descriptor:
[   26.169191] bLength: 12
[   26.189501] bDescriptorType: 1
[   26.210055] bcdUSB: 200
[   26.230073] bDeviceClass: 0
[   26.250026] bDeviceSubClass: 0
[   26.270192] bDeviceProtocol: 0
[   26.290146] bMaxPacketSize0: 40
[   26.309794] idVendor: 1286
[   26.329416] idProduct: 2001
[   26.348511] bcdDevice: 3107
[   26.367397] iManufacturer: 1
[   26.386068] iProduct: 2
[   26.403856] iSerialNumber: 0
[   26.422284] bNumConfigurations: 1
[   26.440832] new descriptor:
[   26.458712] bLength: 12
[   26.476072] bDescriptorType: 1
[   26.494015] bcdUSB: 200
[   26.511433] bDeviceClass: 0
[   26.529281] bDeviceSubClass: 0
[   26.547251] bDeviceProtocol: 0
[   26.564764] bMaxPacketSize0: 40
[   26.582215] idVendor: 1286
[   26.598964] idProduct: 2001
[   26.615594] bcdDevice: 3107
[   26.631824] iManufacturer: 1
[   26.647720] iProduct: 2
[   26.663310] iSerialNumber: 0
[   26.678963] bNumConfigurations: 1
[   26.694814] devpath 1 ep0out 3strikes
[   26.710745] USB_REQ_SET_CONFIGURATION!
[   26.726809] devpath 1 ep0out 3strikes
[   26.742816] usb 1-1: can't restore configuration #1 (error=-71)
[   26.761442] logical disconnect on port 1
[   26.778192] libertas: firmware init failed
[   27.274618] usb8xxx: probe of 1-1:1.0 failed with error -12
[   27.348948] hub_port_wait_reset: portstatus=503 portchange=10
[   27.432563] devpath 1 ep0in 3strikes
[   27.449804] devpath 1 ep0in 3strikes
[   27.466676] devpath 1 ep0in 3strikes
[   27.538792] hub_port_wait_reset: portstatus=503 portchange=12
[   27.608514] usb 1-1: device descriptor read/64, error -71
[   27.794621] hub_port_wait_reset: portstatus=503 portchange=12
[   27.871742] devpath 1 ep0out 3strikes
[   28.103232] devpath 1 ep0out 3strikes
[   28.328971] usb 1-1: device not accepting address 9, error -71
[   28.413784] hub_port_wait_reset: portstatus=100 portchange=10
[   28.435041] hub_port_wait_reset: device went away!
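
The "old descriptor" / "new descriptor" dumps in the log above are tedious to compare by eye (in the first pair, bcdDevice differs: 3107 versus 3105). Below is a minimal Python sketch that pulls the dumps out of a log and reports the fields that differ. It assumes the debug output keeps the "[timestamp] field: value" shape shown above; the function names are made up for illustration.

```python
import re

HEADER_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(old|new) descriptor:\s*$")
FIELD_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\w+):\s*(\w+)\s*$")


def parse_descriptor_blocks(log_text):
    """Return a list of (label, {field: value}) in log order."""
    blocks = []
    current = None
    for line in log_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {}
            blocks.append((header.group(1), current))
            continue
        field = FIELD_RE.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
        else:
            current = None  # any other line ends the current dump
    return blocks


def diff_descriptors(old, new):
    """Return {field: (old_value, new_value)} for fields that differ."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


sample = """\
[    2.990001] old descriptor:
[    3.190148] idProduct: 2001
[    3.211139] bcdDevice: 3107
[    3.312581] new descriptor:
[    3.482461] idProduct: 2001
[    3.499882] bcdDevice: 3105
[    3.586346] USB_REQ_SET_CONFIGURATION!
"""
blocks = parse_descriptor_blocks(sample)
print(diff_descriptors(blocks[0][1], blocks[1][1]))
# {'bcdDevice': ('3107', '3105')}
```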

With 15.5, it successfully resumes a few times, but then:

+r[    0.089589] Timeout waiting for EC to read command!
[    0.709093] usb usb1: root hub lost power or was reset
[    0.950367] usb_reset_device!
[    1.031939] hub_port_wait_reset: portstatus=503 portchange=10

and we're hung forever after the hub_port_wait_reset.

It would be good to be able to rule out the EC as the cause of these bugs, given the timeouts; we'll need Richard/Andres for that.
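
For reference when reading the logs above: on Linux, error -71 is -EPROTO and error -12 is -ENOMEM. The sketch below tallies the error codes in a log. The errno table is hardcoded with the Linux values, because Python's own errno module reflects whatever platform the script runs on; the function name is made up for illustration.

```python
import re
from collections import Counter

# Linux errno values for the two codes that appear in these logs.
LINUX_ERRNO_NAMES = {71: "EPROTO", 12: "ENOMEM"}

# Matches both "error -71" and "(error=-71)".
ERROR_RE = re.compile(r"error[ =]-(\d+)")


def count_errors(log_text):
    """Return a Counter of errno names seen in kernel log text."""
    counts = Counter()
    for match in ERROR_RE.finditer(log_text):
        code = int(match.group(1))
        counts[LINUX_ERRNO_NAMES.get(code, "errno %d" % code)] += 1
    return counts


sample = """\
[    1.898855] usb 1-1: device not accepting address 2, error -71
[   16.664580] usb8xxx: probe of 1-1:1.0 failed with error -12
[   26.742816] usb 1-1: can't restore configuration #1 (error=-71)
"""
print(count_errors(sample))
```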

  Changed 7 years ago by marcelo

New data point on the failure with the detach firmware. In the error case the hub's wPortChange field has bit 1 set:

hub_port_wait_reset: portstatus=503 portchange=12

bit 1 of portchange (from the USB2 spec): Port Enable/Disable Change: (C_PORT_ENABLE) This field is set to one when a port is disabled because of a Port_Error condition (see Section 11.8.1).

11.8.1 Port Error
A Port Error can occur on a downstream facing port that is in the Enabled state. A Port Error
condition exists when:

• The hub is in the WFEOP state with connectivity established upstream from the port when the
(micro)frame timer reaches the EOF2 point.
• At the EOF2 point, the Hub Repeater is in the WFSOPFU state, and there is other
than Idle state on the port.

Since this error occurs just after we do a port reset, I think the hub repeater is in the WFSOPFU state (according to table "11.7.2.3 Repeater State Machine" on page 328).
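
To make the bit reading above easier to repeat on other log lines, here is a small Python sketch that decodes the portstatus/portchange pairs. It assumes the debug printk prints both values in hexadecimal (consistent with portchange=12 having bit 1 set); the bit names follow the wPortStatus and wPortChange definitions in the USB 2.0 hub chapter, and the function names are made up for illustration.

```python
import re

PORT_STATUS_BITS = {
    0: "PORT_CONNECTION",
    1: "PORT_ENABLE",
    2: "PORT_SUSPEND",
    3: "PORT_OVER_CURRENT",
    4: "PORT_RESET",
    8: "PORT_POWER",
    9: "PORT_LOW_SPEED",
    10: "PORT_HIGH_SPEED",
    11: "PORT_TEST",
    12: "PORT_INDICATOR",
}

PORT_CHANGE_BITS = {
    0: "C_PORT_CONNECTION",
    1: "C_PORT_ENABLE",
    2: "C_PORT_SUSPEND",
    3: "C_PORT_OVER_CURRENT",
    4: "C_PORT_RESET",
}

LINE_RE = re.compile(r"portstatus=([0-9a-fA-F]+) portchange=([0-9a-fA-F]+)")


def decode_bits(value, names):
    """Return the names of the set bits, lowest bit first."""
    return [names[bit] for bit in sorted(names) if value & (1 << bit)]


def decode_port_line(line):
    """Decode one hub_port_wait_reset debug line; None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1), 16)  # assumption: values are printed in hex
    change = int(match.group(2), 16)
    return decode_bits(status, PORT_STATUS_BITS), decode_bits(change, PORT_CHANGE_BITS)


print(decode_port_line("hub_port_wait_reset: portstatus=503 portchange=12"))
print(decode_port_line("hub_port_wait_reset: portstatus=503 portchange=10"))
```

With these tables, portchange=10 decodes to C_PORT_RESET alone, while portchange=12 adds C_PORT_ENABLE, the bit discussed above.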

follow-up: ↓ 23   Changed 7 years ago by jg

  • owner changed from richard to rchokshi
  • milestone changed from Trial-2 to Trial-3

Ronak, could you have your wizards look at this one when you get machines?

in reply to: ↑ 22 ; follow-ups: ↓ 24 ↓ 25   Changed 7 years ago by rsmith

  • milestone changed from Trial-3 to Trial-2

Replying to jg:

Ronak, could you have your wizards look at this one when you get machines?

http://dev.laptop.org/~rsmith/crash.zip

This file contains an Ellisys Visual USB trace of the crash. Nothing special was done with the setup. This was an os540 image that I just booted; I started recording and then did suspend/resume cycles. About 5 or 6 suspend/resume cycles worked, and then the last one hung the machine.

in reply to: ↑ 23   Changed 7 years ago by rchokshi

Replying to rsmith:

http://dev.laptop.org/~rsmith/crash.zip This file contains an Ellisys Visual USB trace of the crash. Nothing special was done with the setup. This was an os540 image that I just booted; I started recording and then did suspend/resume cycles. About 5 or 6 suspend/resume cycles worked, and then the last one hung the machine.

Was this test done with continuous outgoing or incoming traffic over the wireless interface or was the XO idle? How was the resume done - initiated through the wireless packets or through buttons on the XO?

in reply to: ↑ 23 ; follow-up: ↓ 26   Changed 7 years ago by rchokshi

Replying to rsmith:

http://dev.laptop.org/~rsmith/crash.zip This file contains an Ellisys Visual USB trace of the crash. Nothing special was done with the setup. This was an os540 image that I just booted; I started recording and then did suspend/resume cycles. About 5 or 6 suspend/resume cycles worked, and then the last one hung the machine.

This trace is quite inconclusive, honestly. In the final crash, it only shows an incomplete IN transaction.

in reply to: ↑ 25 ; follow-up: ↓ 27   Changed 7 years ago by rsmith

suspend/resume cycles. There are about 5 or 6 or so suspend/resume cycles that worked and then the last one hung the machine.

This trace is quite inconclusive, honestly. In the final crash, it only shows an incomplete IN transaction.

I have to work on EC problems, but Chris Ball can drive the analyzer. It's set up and ready to go. Tell us how to give you more info. More traces of the same thing? More retries in the kernel?

We can also do some sort of VNC thing where we talk on the phone and you can look at the data in real time.

in reply to: ↑ 26   Changed 7 years ago by rchokshi

  • cc marcelo@… added

Replying to rsmith:

suspend/resume cycles. There are about 5 or 6 or so suspend/resume cycles that worked and then the last one hung the machine.

This trace is quite inconclusive, honestly. In the final crash, it only shows an incomplete IN transaction.

I have to work on EC problems, but Chris Ball can drive the analyzer. It's set up and ready to go. Tell us how to give you more info. More traces of the same thing? More retries in the kernel?

I think it would be great if you could send a bad trace and a good trace captured under the circumstances described in #2621. This will probably go a long way toward explaining the root cause of this issue.

We can also do some sort of VNC thing where we talk on the phone and you can look at the data in real time.

Yes, this might help. Please let me know what time would be good for you, and we will arrange our schedule around it.

In addition to my other questions above which are still unanswered:

It is clear that the firmware is not involved at all during the suspend/resume cycle, i.e. it stays in the fully awake state.

So I cannot think of any failure that could be caused by the firmware, because the device does almost nothing during the suspend/resume cycle.

Because of that, the suspend/resume cycle is essentially the same as an enumeration cycle from the firmware's side. We therefore thought that a regression test of the enumeration cycle would suffice to check whether the USB design of the 8388 is at fault. We ran the enumeration cycle test, and it passed. This assumes that the wakeup is not wireless-initiated.

The enumeration cycle test can be run with the software tool for Windows XP PCs, which can be downloaded from the usb.org web site.

Could you confirm if you are using 5.110.17.p0?

  Changed 7 years ago by jg

  • keywords power added

  Changed 7 years ago by jg

  • milestone changed from Trial-2 to Trial-3

  Changed 7 years ago by jg

  • cc jg added

  Changed 7 years ago by kimquirk

  • milestone changed from Untriaged to Trial-3

  Changed 7 years ago by wad

  • owner changed from rchokshi to wad

We are continuing to see this bug, although with greatly diminished frequency on units with the B4 Suspend ECO applied.

We will attempt to gather a dump of crash information for more recent analysis.

To a user, this problem is very similar to #1835.

Perhaps there is some connection with #4131, where communication with the WLAN module continues fine, but there is no wireless connectivity.

  Changed 7 years ago by kimquirk

  • milestone changed from Trial-3 to First Deployment, V1.0

Continuing to work on this bug... not in trial-3

  Changed 7 years ago by wad

This bug could also be confused with #2621, a similar problem that was due to incoming network traffic waking up the laptop before it had finished suspending. The EC firmware was modified to delay the wakeup interrupt in this case. Released firmware versions q2c28 and later have this modification, and should be used when debugging #1752.

  Changed 7 years ago by kimquirk

Wad, can you respond as to whether this particular bug should remain open at this point? We're trying to close down the release and it seems like this is fixed. If there are still issues from deep inside this bug, can you open a new bug?

  Changed 7 years ago by wad

  • status changed from new to closed
  • resolution set to fixed

Closed at this time. This appears to have been ameliorated by the addition of better HF bypassing to the USB power supplies on the Southbridge and WLAN modules.
