Opened 17 months ago

Closed 4 months ago

#12694 closed defect (fixed)

page allocation failure, mwifiex

Reported by: Quozl Owned by:
Priority: normal Milestone: 13.2.1
Component: kernel Version: Software Build 13.2.0-13
Keywords: Cc: dsd, godiard
Blocked By: Blocking:
Deployments affected: Action Needed: no action
Verified: no

Description

Copying many files over wireless to a system using rsync over ssh caused a kernel message and the system could no longer be accessed over wireless.

[  490.687834] ksdioirqd/mmc0: page allocation failure: order:0, mode:0x120
[  490.694515] [<c0015bc4>] (unwind_backtrace+0x0/0x128) from [<c0456418>] (dump_stack+0x20/0x24)
[  490.703066] [<c0456418>] (dump_stack+0x20/0x24) from [<c00aee68>] (warn_alloc_failed+0x108/0x128)
[  490.703066] [<c00aee68>] (warn_alloc_failed+0x108/0x128) from [<c00b1360>] (__alloc_pages_nodemask+0x53c/0x660)
[  490.711868] [<c00b1360>] (__alloc_pages_nodemask+0x53c/0x660) from [<c036fafc>] (netdev_alloc_frag+0x44/0xe8)
[  490.731716] [<c036fafc>] (netdev_alloc_frag+0x44/0xe8) from [<c0371194>] (__netdev_alloc_skb+0x3c/0xd4)
[  490.731716] [<c0371194>] (__netdev_alloc_skb+0x3c/0xd4) from [<bf0eab5c>] (mwifiex_process_int_status+0x314/0xa14 [mwifiex_sdio])
[  490.741050] [<bf0eab5c>] (mwifiex_process_int_status+0x314/0xa14 [mwifiex_sdio]) from [<bf0ba180>] (mwifiex_main_process+0xf4/0x4c8 [mwifiex])
[  490.765336] [<bf0ba180>] (mwifiex_main_process+0xf4/0x4c8 [mwifiex]) from [<bf0eb518>] (mwifiex_sdio_interrupt+0x15c/0x178 [mwifiex_sdio])
[  490.765336] [<bf0eb518>] (mwifiex_sdio_interrupt+0x15c/0x178 [mwifiex_sdio]) from [<c0305ba8>] (sdio_irq_thread+0x178/0x2f0)
[  490.777674] [<c0305ba8>] (sdio_irq_thread+0x178/0x2f0) from [<c00467a4>] (kthread+0x9c/0xac)
[  490.788805] [<c00467a4>] (kthread+0x9c/0xac) from [<c000fd80>] (kernel_thread_exit+0x0/0x8)
[  490.797181] Mem-info:
[  490.805456] Normal per-cpu:
[  490.807704] CPU    0: hi:  186, btch:  31 usd:  30
[  490.810470] HighMem per-cpu:
[  490.818069] CPU    0: hi:   90, btch:  15 usd:  41
[  490.822824] active_anon:21847 inactive_anon:1200 isolated_anon:0
[  490.822824]  active_file:17233 inactive_file:138038 isolated_file:0
[  490.822824]  unevictable:0 dirty:14511 writeback:0 unstable:0
[  490.822824]  free:42081 slab_reclaimable:9464 slab_unreclaimable:1717
[  490.822824]  mapped:10187 shmem:1481 pagetables:753 bounce:0
[  490.822824] Normal free:167780kB min:3508kB low:4384kB high:5260kB active_anon:3676kB inactive_anon:4kB active_file:33148kB inactive_file:415216kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:772160kB mlocked:0kB dirty:58044kB writeback:0kB mapped:0kB shmem:4kB slab_reclaimable:37856kB slab_unreclaimable:6868kB kernel_stack:1264kB pagetables:3012kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  490.852664] lowmem_reserve[]: 0 2032 2032
[  490.891616] HighMem free:544kB min:252kB low:544kB high:840kB active_anon:83712kB inactive_anon:4796kB active_file:35784kB inactive_file:136936kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:260096kB mlocked:0kB dirty:0kB writeback:0kB mapped:40748kB shmem:5920kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  490.933607] lowmem_reserve[]: 0 0 0
[  490.933607] Normal: 1391*4kB 1389*8kB 1390*16kB 1391*32kB 1318*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB = 167780kB
[  490.951113] HighMem: 2*4kB 1*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB = 544kB
[  490.951113] 156761 total pagecache pages
[  490.967595] 0 pages in swap cache
[  490.970879] Swap cache stats: add 0, delete 0, find 0/0
[  490.970879] Free swap  = 0kB
[  490.976055] Total swap = 0kB
[  490.978907] 260096 pages of RAM
[  491.003494] 42528 free pages
[  491.006608] 4172 reserved pages
[  491.012568] 11181 slab pages
[  491.012568] 179484 pages shared
[  491.015420] 0 pages swap cached
[  491.021797] mwifiex_sdio mmc0:0001:1: mwifiex_process_int_status: failed to alloc skb
[  501.054266] mwifiex_sdio mmc0:0001:1: mwifiex_cmd_timeout_func: Timeout cmd id (1370331855.479994) = 0xa4, act = 0x0
[  501.064775] mwifiex_sdio mmc0:0001:1: num_data_h2c_failure = 0
[  501.064775] mwifiex_sdio mmc0:0001:1: num_cmd_h2c_failure = 0
[  501.076260] mwifiex_sdio mmc0:0001:1: num_cmd_timeout = 1
[  501.081621] mwifiex_sdio mmc0:0001:1: num_tx_timeout = 0
[  501.086903] mwifiex_sdio mmc0:0001:1: last_cmd_index = 2
[  501.092177] mwifiex_sdio mmc0:0001:1: last_cmd_resp_index = 1
[  501.097900] mwifiex_sdio mmc0:0001:1: last_event_index = 3
[  501.103342] mwifiex_sdio mmc0:0001:1: data_sent=0 cmd_sent=1
[  501.108971] mwifiex_sdio mmc0:0001:1: ps_mode=0 ps_state=0

This was on a custom build based on 13.2.0, with kernel hash aeff3e2, (despite f0a3916 being the latest in git), without Sugar. olpc-os-builder patch: http://dev.laptop.org/~quozl/z/1Ujm4Z.txt

Change History (18)

comment:1 Changed 17 months ago by Quozl

Reproduced, again using rsync over ssh, but this time the affected system was sending a filesystem tree to another system. The affected system in both cases has 1 GB memory.

comment:2 Changed 14 months ago by Quozl

  • Action Needed changed from reproduce to diagnose
  • Milestone changed from 13.2.0 to Future Release
  • Priority changed from low to normal
  • Version set to Software Build 13.2.0-13

Reproduced #12744 (closed, duplicate).

comment:3 Changed 13 months ago by Quozl

Also occurs using olpc-os-builder on a fresh 13.2.0-13 install.
http://wiki.laptop.org/go/OS_Builder/Run_on_XO_hardware

Workaround is to use USB ethernet adapter.

comment:4 Changed 12 months ago by Quozl

  • Action Needed changed from diagnose to design
  • Cc dsd added

Analysis: the driver is responding to an interrupt from the wireless card, the interrupt is for arriving data, and a memory allocation made at the time of the interrupt fails because the system is out of memory. The remainder of the job of reading the data from the card does not occur. The card is left in an inappropriate state. Further commands time out. powerd forks ethtool which hangs in D state. powerd does not recover, leaving the display off but the power indicator on.

A few things were looked at:

  • the failure is in drivers/net/wireless/mwifiex/sdio.c where the function mwifiex_process_int_status calls dev_alloc_skb,
  • there are no new changes to sdio.c in upstream kernel.org,
  • none of the changes in git seemed to relate to the problem,
  • to gather more data, sdio.c was changed to report allocation size; allocations for 256 bytes and 1792 bytes were seen; not unremarkable, and seems sane,
  • as speculation, sdio.c was changed to write 0x04 to configuration register as nearby code does; there was no change to symptom.

Speculation:

  • the wrong memory allocation technique is being used, (but my knowledge of the available techniques is insufficient),
  • the actions required to read the packets and dismiss the interrupt might be done even if memory cannot be allocated, losing the data, but at least the card should be left in a usable state.

comment:5 Changed 12 months ago by Quozl

  • considered the Marvell 473607 race condition as being a cause, but excluded as it relates primarily to the write bit map,
  • reviewed SDIO_Multi-Port_in_Marvell_MLAN_Driver_v0_12.pdf against sdio.c, determined a potential design flaw; a race condition; that the read bit map can have a bit set even though the length has not been set, but could not determine a mechanism by which this would cause #12694,
  • excluded possibility of an interrupt storm, by adding counters and printing them when dev_alloc_skb fails; the counters were not unusually large,
  • excluded bluetooth modules, by removing btmrvl* from filesystem and rebooting; no change,
  • began considering a simple deferral as a solution, by inserting code at the point of failure to restore state to what it was before the event, but more work is needed, in particular to reschedule mwifiex_main_process for a later time when memory may be available.

comment:6 Changed 12 months ago by Quozl

The best reproducer for this problem is:

# deplete memory
sudo dd if=/dev/mmcblk0 of=/dev/null &
# watch free until memory is mostly consumed by buffers or cache
# begin file transfer
rsync -avp server:big-file .
# expect kernel messages within 15 to 30 seconds

A patch was developed to do the actions required to read the packets and dismiss the interrupt ... even if memory cannot be allocated, losing the data,.

The patch is here: http://dev.laptop.org/~quozl/y/1VdGLj.txt

What this patch does:

  • turn off receive aggregation, (see comment below),
  • preallocate a buffer during driver startup,
  • use the buffer during receive interrupt when memory cannot be allocated,
  • avoid passing the buffer to the network stack,
  • recycle the buffer for further use.

What results were observed:

  • the rsync over ssh did not stall, whereas previously it did,
  • the consequent command timeout was not observed,
  • the failure to allocate was logged in kernel messages, quite often,
  • the lack of event or command response processing eventually resulted in process hangs of new command invocations of ifconfig or tcpdump.

Receive aggregation if left on resulted in corruption of the socket buffers, but this may have been my fault.

comment:7 Changed 12 months ago by Quozl

In a similar situation was a recommendation to increase min_free_kbytes, which has a value of 3511 on my test system, and changing this affected the probability of failure;

  • 32768, no apparent change,
  • 65536, no apparent change,
  • 131072, possibly decreased frequency, but dev_alloc_skb still did fail eventually,
  • 262144, eliminated the failure entirely.

Therefore a temporary workaround with a large impact (loss of useful memory) is:

 echo 262144 > /proc/sys/vm/min_free_kbytes

Also Russ Garrett says of min_free_kbytes that it is useful for swap-less machines, where you don’t want incoming network traffic to overwhelm the kernel and force an OOM before it has time to flush any buffers.

comment:8 Changed 12 months ago by Quozl

  • Cc godiard added
  • adding swap does not change symptom,
  • testing with 3d861f66 (net: fix secpath kmemleak) does not change symptom,
  • adding a queue of preallocated socket buffers for use only when memory is scarce, and hiding the page allocation failure, (see patch), does prevent the symptom,
  • extending this to disable the interrupt (using device registers) when the queue is short, and enabling the interrupt when the queue is restored, does not work; there is a command timeout.

comment:9 Changed 12 months ago by Quozl

  • the aforementioned patch relocated the page allocation failures deeper in the call stack, in tcp_send_ack, although they seem harmless.
  • therefore mwifiex was acting as an early warning, and there remains an underlying allocation problem to be solved.

comment:10 Changed 12 months ago by Quozl

Tests with unchanged kernel, arm-3.5 e77de3d, without using eMMC as target for the copy,

  • symptom does not occur using a USB hard drive,
  • symptom does occur using a USB flash drive, after many minutes.

comment:11 Changed 11 months ago by Quozl

  • changed driver to do a simple deferral, by inserting code at the point of failure to restore state to what it was before the event; no longer hangs, but has a side effect of consuming CPU in ksdioirqd kernel thread for the short time that memory is unavailable. Tests and refinement in progress.

comment:12 Changed 11 months ago by Quozl

Changes pushed to arm-3.5 to build a test kernel:

comment:13 Changed 11 months ago by Quozl

  • Action Needed changed from design to review

A kernel rpm is available with the changes mentioned above.

comment:14 Changed 5 months ago by Quozl

Tested with modified high memory threshold as used in #12781, no change to symptom.

comment:15 Changed 5 months ago by Quozl

  • Milestone changed from Future Release to 13.2.1

comment:16 Changed 5 months ago by Quozl

  • Action Needed changed from review to add to build

The adequate solution to be added to 13.2.1, then this ticket will return to design phase to locate a better solution, which would appear to be memory management related, and may be fixed by adoption of a later kernel.

comment:17 Changed 4 months ago by Quozl

  • Action Needed changed from add to build to test in release

Is in 13.2.1.

comment:18 Changed 4 months ago by Quozl

  • Action Needed changed from test in release to no action
  • Resolution set to fixed
  • Status changed from new to closed

Retested. Works as described. The patch series can be removed if a later kernel is ported, or if the problem is found to be related to CMA (#12785).

Note: See TracTickets for help on using tickets.