Opened 2 years ago

Last modified 19 months ago

#12151 new defect

mwifiex creates multiple interfaces by default

Reported by: shep
Owned by: shep
Priority: normal
Milestone: Future Release
Component: network manager
Version: not specified
Keywords:
Cc: dsd
Blocked By:
Blocking:
Deployments affected:
Action Needed: never set
Verified: no

Description

With "12.1.0 for XO-1.75, customized (build 22)" on an XO-4 B1 (software as it came from the factory, but with sshd enabled) and an 8787 wireless card swapped in, I noticed unusual latencies over ssh. They were bad enough that I stopped to investigate.

Pinging the laptop showed 35% packet loss, and many of the packets that were not lost were delayed by several seconds.
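For anyone wanting to reproduce these numbers from a ping log, here is a minimal Python sketch. It assumes the log uses the usual iputils reply format (`icmp_seq=N ... time=X ms`, or `icmp_req=N` on some versions); the function name `summarize_ping` and the 1000 ms "slow" threshold are illustrative choices, not anything taken from the attached logs. Loss is estimated from gaps in the sequence numbers, so replies lost after the last received one are not counted.

```python
import re

# Reply lines look like:
#   64 bytes from 192.0.2.1: icmp_seq=3 ttl=64 time=1234 ms
# Some iputils versions print icmp_req= instead of icmp_seq=.
REPLY_RE = re.compile(r"icmp_(?:seq|req)=(\d+)\b.*?\btime=([\d.]+) ms")


def summarize_ping(lines, slow_ms=1000.0):
    """Estimate loss and count slow replies from ping output lines."""
    rtts = {}
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            # Keep the first reply per sequence number (ignore duplicates).
            rtts.setdefault(int(match.group(1)), float(match.group(2)))
    if not rtts:
        return {"sent": 0, "received": 0, "loss_pct": 0.0, "slow": 0}
    # Assumption: every sequence number between the first and last
    # reply seen was actually sent.
    sent = max(rtts) - min(rtts) + 1
    received = len(rtts)
    return {
        "sent": sent,
        "received": received,
        "loss_pct": 100.0 * (sent - received) / sent,
        "slow": sum(1 for rtt in rtts.values() if rtt >= slow_ms),
    }


if __name__ == "__main__":
    import sys
    print(summarize_ping(sys.stdin))
```

Feeding it a log on standard input prints the summary, which makes it easy to compare runs with and without NetworkManager.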

I managed to make this problem go away by shutting off NetworkManager and configuring the wireless interface by hand.

I will attach ping logs with (con) and without (sin) NetworkManager running.

With all the 5GHz channels available on the 8787 card, maybe scans take much longer, and NetworkManager (or perhaps its clients?) is asking for scans often enough for this to be a problem.

Attachments (4)

8787_ping_con_NetworkManager.log (14.7 KB) - added by shep 2 years ago.
8787_ping_sin_NetworkManager.log (22.4 KB) - added by shep 2 years ago.
8686_ping_con_NetworkManager.log (22.5 KB) - added by shep 2 years ago.
8787_ping_con_NetworkManager_sin_uap0.log (21.9 KB) - added by shep 2 years ago.


Change History (15)

Changed 2 years ago by shep

Changed 2 years ago by shep

comment:1 Changed 2 years ago by dsd

  • Cc dsd added
  • Milestone changed from Not Triaged to 13.1.0
  • Owner changed from dsd to shep

As a next step, I would add or enable some driver debugging to see how long scans really take. You could also check whether the NM scan interval can be adjusted, and try increasing it as an experiment.

Changed 2 years ago by shep

comment:2 Changed 2 years ago by shep

At pgf's suggestion, I tried my ping experiment with a fresh XO-4 4B1 with the 8686 card with NetworkManager still running, and I don't see the problem.

So this problem seems to be the combination of mwifiex/8787 and NetworkManager, which is what I thought it was.

Log of ping to the 8686 interface with NetworkManager running attached.

comment:3 Changed 2 years ago by shep

On IRC, while helping pgf figure out some weirdness he is seeing, it occurred to me that the root of this problem might be that NetworkManager finds both the mlan0 interface (renamed to eth0) and the uap0 interface and tries to make them both go, not realizing that underneath it is the same hardware.

I should look into this theory, and as pgf suggests one easy way to fix this might be to patch the mwifiex driver so it does not offer up the uap0 interface.

comment:4 Changed 2 years ago by dsd

A quick look at /var/log/messages would confirm or deny that. There are also the debugging suggestions I made above that might help you here.

comment:5 Changed 2 years ago by shep

By watching logs I was able to confirm that NetworkManager is trying to make both the mlan0 and the uap0 interface go at the same time.

Since NetworkManager does not seem to offer any reasonable way to ignore the uap0 interface by interface name, I crudely disabled the creation of the uap0 interface in the mwifiex driver, and that seems to have helped a lot. But I still see some packet loss (now 3%, down from 35%), and delayed packets can be seen in the ping trace I just attached.

Looks like scans are taking around 10 seconds.

comment:6 Changed 2 years ago by dsd

The normal way for a wireless driver to offer auxiliary functionality (monitor mode, AP mode, etc) would be to allow a new interface to be created on the wiphy via cfg80211 (the 'iw' command in userspace). The special interfaces (e.g. AP) would not be created by default.

comment:7 Changed 2 years ago by dsd

It is also odd that the AP interface would advertise station capability (otherwise NM would ignore it), and that it supports scanning and so on. But maybe there are some ins and outs of this hardware that I'm not familiar with yet...

comment:8 Changed 2 years ago by shep

Through what api do the interfaces advertise capabilities such as "station"?

If I run iw dev it clearly shows the interface type of uap0 as "AP":

# iw dev
phy#0
        Interface uap0
                ifindex 3
                type AP
        Interface eth0
                ifindex 2
                type managed

Using iw list, I see some capabilities listed for each band (both bands are the same):

Wiphy phy0
        Band 1:
                Capabilities: 0x16e
                        HT20/HT40
                        SM Power Save disabled
                        RX HT20 SGI
                        RX HT40 SGI
                        RX STBC 1-stream
                        Max AMSDU length: 3839 bytes
                        No DSSS/CCK HT40
[...]
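To check other machines for the same situation, the `iw dev` output above can be parsed mechanically. The following Python sketch assumes the layout shown in this comment (a `phy#N` header followed by `Interface` and `type` lines); the helper names `parse_iw_dev` and `shared_phys` are made up for illustration, and the `iw` output format is not guaranteed to be stable across versions.

```python
# Sample copied from the `iw dev` output quoted in this ticket.
SAMPLE = """\
# iw dev
phy#0
        Interface uap0
                ifindex 3
                type AP
        Interface eth0
                ifindex 2
                type managed
"""


def parse_iw_dev(text):
    """Map each phy to a list of (interface, type) pairs."""
    phys = {}
    phy = iface = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("phy#"):
            phy = line
            phys[phy] = []
            iface = None
        elif line.startswith("Interface ") and phy is not None:
            iface = line.split(None, 1)[1]
        elif line.startswith("type ") and iface is not None:
            phys[phy].append((iface, line.split(None, 1)[1]))
            iface = None
    return phys


def shared_phys(phys):
    """Return only the phys that expose more than one interface."""
    return {phy: ifaces for phy, ifaces in phys.items() if len(ifaces) > 1}


if __name__ == "__main__":
    for phy, ifaces in shared_phys(parse_iw_dev(SAMPLE)).items():
        print(phy, ifaces)
```

On the sample, this reports phy#0 as carrying both uap0 (type AP) and eth0 (type managed), which is the shared-hardware case discussed in this ticket.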

comment:9 Changed 2 years ago by dsd

Ah, that looks good, I'm a little surprised that NM started to use it as if it were a station. Might be worth a quick mail to the NM list to clarify that this behaviour is intended.

comment:10 Changed 2 years ago by shep

With kernel patch c83171b43d2529e5a93b1621b1b51f5b76aa38fa from 3.5 weeks ago which crudely disables the creation of the uap0 interface, things seem pretty stable.

The remaining problem is the latency introduced by the scans, which when doing 1 Hz pings typically do not lead to any packets dropped but do delay some packets for as long as it takes the scan to happen (6 to 10 seconds to cover both bands). Maybe this ticket should be split into two new tickets... one for getting the right fix for mwifiex or NetworkManager to properly handle the presence of the uap0 interface (so we don't have to carry the crude patch that disables the uap0 interface), and another ticket to track the problem of delayed packets while the scan is happening.

A patch was posted to linux-wireless that is supposed to reduce the packet delays during scans. We should investigate that patch.

comment:11 Changed 19 months ago by dsd

  • Milestone changed from 13.1.0 to Future Release
  • Summary changed from losses and high latency caused by NetworkManager with 8787 wireless module to mwifiex creates multiple interfaces by default

The issue with uap0 being created by default is still present, but we think we've convinced Marvell to fix that. For now we have a workaround in our builds.

The scan dwell times in 8787 were also quite high. We pushed a kernel patch (from upstream) to sanitize these values and the interruptions caused by scans are almost unnoticeable now. Newer upstream kernels have a more complete solution where channels are scanned a bunch at a time, at intervals, to interrupt connectivity even less.
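As a rough illustration of why both changes help, the longest connectivity gap is approximately the number of channels scanned back to back multiplied by the dwell time per channel. The Python sketch below is only a back-of-the-envelope model, not the driver's logic: the function names are hypothetical, the channel count and dwell times are made-up illustration values rather than measurements from the 8787, and it assumes the radio returns to the operating channel between batches.

```python
def batch_channels(channels, batch_size):
    """Split a channel list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [channels[i:i + batch_size]
            for i in range(0, len(channels), batch_size)]


def longest_interruption_ms(channels, dwell_ms, batch_size=None):
    """Longest continuous off-channel time in milliseconds.

    With batch_size=None, all channels are scanned in one pass.
    """
    if not channels:
        return 0
    size = batch_size if batch_size is not None else len(channels)
    batches = batch_channels(channels, size)
    return max(len(batch) for batch in batches) * dwell_ms


if __name__ == "__main__":
    channels = list(range(35))  # hypothetical channel count across both bands
    print(longest_interruption_ms(channels, dwell_ms=200))                # one pass
    print(longest_interruption_ms(channels, dwell_ms=50))                 # shorter dwell
    print(longest_interruption_ms(channels, dwell_ms=50, batch_size=4))   # batched
```

In this model, shortening the dwell time shrinks the gap proportionally, and batching caps it at the batch size times the dwell time regardless of how many channels exist.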
