Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#10366 closed defect (fixed)

idle suspend causes network connection in progress to fail

Reported by: greenfeld
Owned by:
Priority: normal
Milestone: 11.3.0
Component: not assigned
Version: not specified
Keywords:
Cc: pgf, sridhar
Blocked By:
Blocking:
Deployments affected:
Action Needed: no action
Verified: no

Description

This likely needs to be triaged, as I have probably been guessing wrong about why eth0 loses track of its ESSID/IP address information. Again, this is with the RPMs from #9845 installed, but at first glance the problem appears to be below the Sugar layer.

My production XO-1.5 laptops occasionally lose track of what to connect to, even when they are supposed to default to an Adhoc network with the #9845 changes. One time, after I cleared the Network History from Sugar, clicked the checkmark to close the Network control panel, and pressed Ctrl-Alt-Esc to restart X and Sugar, I noticed that the #9845 change did not reconnect to the Adhoc #1 network by default as it should. So I enabled sugar and presence server debugging, and cleared the history and restarted X a few more times on the laptop showing the issue (as well as on one which wasn't showing it), trying to get the problem to reproduce and/or go away. (It happened twice on the affected laptop and persisted for a while each time; it never happened on the other XO-1.5, although that one has shown this behavior before.)

Martin took an initial look, and although it may not relate to the above, the area around "Sep 15 17:14:52" in var/log/messages looked interesting. There, the laptop decided to sleep in the middle of NetworkManager setting up a connection. This resulted in NetworkManager deciding that the connection was invalid, and not trying to restore it.

Attached please find the system/sugar/powersave log files from the system which lost track of connections. I turned on verbose sugar debugging after the first time the issue was spotted.

Attachments (1)

logbundle.tgz (119.8 KB) - added by greenfeld 4 years ago.
Bundle of log files from the system which lost track of its ESSID & didn't connect to AdHoc1 instead

Change History (7)

Changed 4 years ago by greenfeld

Bundle of log files from the system which lost track of its ESSID & didn't connect to AdHoc1 instead

comment:1 Changed 4 years ago by Quozl

  • Action Needed changed from never set to diagnose
  • Cc pgf added
  • Component changed from network manager to not assigned
  • Milestone changed from Not Triaged to 10.1.3
  • Owner dsd deleted
  • Summary changed from Possible Networkmanager/S3 sleep race condition and/or other issue(s) to idle suspend causes network connection in progress to fail

Triage may include reworking the problem description, reproducing, prioritising, and proposing a milestone.

The problem is that an idle suspend interrupts the establishment of a connection by Network Manager, causing the attempt to connect to fail.

I have observed this in other contexts with an unpatched os852, so I consider it reproduced.

I agree with a normal priority.

I propose 10.1.3 as milestone.

I don't think that the problem is necessarily in Network Manager; it might well be in powerd. I don't think Network Manager was ever intended to complete the establishment of a connection if that operation is interrupted by suspend.

comment:2 Changed 4 years ago by martin.langhoff

NM should know we're in the process of suspending. I propose investigating whether there is a POSIX signal or D-Bus message that powerd should be sending to NM.

comment:3 Changed 4 years ago by Quozl

I don't understand. Once a suspend is requested by powerd, the NetworkManager process may not execute again until resume. If NetworkManager is in the process of connecting, e.g. it has issued I/O calls to the network device and is waiting for the result, then it won't get the result until resume, and this delay may invalidate the connection timing, such as that of the DHCP negotiation.
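
To illustrate the timing argument, here is a minimal sketch of a connection attempt measured against a wall-clock timeout. It is a toy model, not NetworkManager code: the function name, the step durations, and the 45-second timeout are all assumptions chosen for illustration. It only shows that time spent asleep still counts against the deadline, even though the process did no work during it.

```python
# Toy model only: none of these names or numbers come from NetworkManager or powerd.
def attempt_survives(steps, timeout_s, suspend_at_s=None, suspend_for_s=0.0):
    """Return True if every step finishes before the wall-clock timeout.

    steps         -- running time (seconds) needed by each stage of the attempt
    timeout_s     -- wall-clock limit for the whole attempt (assumed value)
    suspend_at_s  -- running time at which the system suspends, or None
    suspend_for_s -- how long the system stays asleep
    """
    running = 0.0  # time during which the process was actually executing
    wall = 0.0     # elapsed real time, as the far end of the link sees it
    suspend_done = suspend_at_s is None
    for duration in steps:
        # The suspend lands inside this step: the step only completes after
        # resume, so the sleep time is added to the wall clock.
        if not suspend_done and running + duration > suspend_at_s:
            wall += suspend_for_s
            suspend_done = True
        running += duration
        wall += duration
    return wall <= timeout_s


# Association, authentication, DHCP: 10 s of work against a 45 s limit.
uninterrupted = attempt_survives([2, 3, 5], timeout_s=45)
interrupted = attempt_survives([2, 3, 5], timeout_s=45,
                               suspend_at_s=4, suspend_for_s=60)
```

In this model the uninterrupted attempt completes within the limit, while the same attempt with a 60-second suspend in the middle does not, although the amount of work is identical.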

comment:4 Changed 3 years ago by dsd

  • Action Needed changed from diagnose to test in build

powerd now monitors NM state and avoids idle-suspending while connecting to wifi. Please test 11.3.0 build 5.
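
The decision described above can be sketched roughly as follows. This is not the actual powerd change: the `NMState` names are symbolic stand-ins (real NetworkManager state codes differ between versions), and `may_idle_suspend` and its parameters are hypothetical. How the state is obtained from NM is left out entirely.

```python
from enum import Enum


class NMState(Enum):
    # Symbolic stand-ins only; not the numeric codes NetworkManager reports.
    DISCONNECTED = "disconnected"
    CONNECTING = "connecting"
    CONNECTED = "connected"


def may_idle_suspend(nm_state, idle_s, idle_threshold_s):
    """Decide whether an idle suspend is allowed right now (hypothetical helper).

    nm_state         -- last NetworkManager state seen by the power daemon
    idle_s           -- seconds since the last user activity
    idle_threshold_s -- idle time after which a suspend is normally allowed
    """
    # Never suspend while a connection is being established.
    if nm_state is NMState.CONNECTING:
        return False
    # Otherwise fall back to the ordinary idle-time rule.
    return idle_s >= idle_threshold_s
```

With this rule, an idle machine in the connecting state stays awake, while a connected or disconnected one suspends as before. It does not delay a suspend that happens before any connection attempt has started.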

comment:5 Changed 3 years ago by greenfeld

  • Action Needed changed from test in build to no action
  • Resolution set to fixed
  • Status changed from new to closed

I have not seen a clear case of this happening since, so the issue is presumably fixed in the 11.3.0 series.

However, we still may suspend prior to attempting any network connection.

comment:6 Changed 3 years ago by sridhar

  • Cc sridhar added