Ticket #11077 (closed defect: worksforme)

Opened 3 years ago

Last modified 2 years ago

XO-1.75 mmc0: Timeout waiting for hardware interrupt

Reported by: Quozl
Owned by: saadia
Priority: blocker
Milestone: 1.75-software
Component: kernel
Version: not specified
Keywords:
Cc: wad
Action Needed: no action
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description

A kernel message appears frequently during runin testing on A3 and B1 units. It does not affect the 12-hour rate of the runin-sdwrite test, and it has no obvious effect other than the kernel message output.

[   36.064563] mmc0: Timeout waiting for hardware interrupt.
[   36.064588] sdhci: =========== REGISTER DUMP (mmc0)===========
[   36.064608] sdhci: Sys addr: 0x00000000 | Version:  0x00000002
[   36.064635] sdhci: Blk size: 0x00000000 | Blk cnt:  0x00000000
[   36.064635] sdhci: Argument: 0x000001aa | Trn mode: 0x00000000
[   36.064660] sdhci: Present:  0x01fa0000 | Host ctl: 0x00000001
[   36.064673] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
[   36.064673] sdhci: Wake-up:  0x00000000 | Clock:    0x00000000
[   36.064685] sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
[   36.064698] sdhci: Int enab: 0x00ff0003 | Sig enab: 0x00ff0003
[   36.064711] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[   36.064736] sdhci: Caps:     0x25fcc8b2 | Caps_1:   0x00002f77
[   36.064736] sdhci: Cmd:      0x0000081a | Max curr: 0x00000000
[   36.064760] sdhci: ===========================================
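
The Cmd value in the dump can be decoded to see which command was outstanding when the timeout fired. The following is a minimal sketch, assuming the standard SDHCI Command register layout (command index in bits 13:8, flags in the low byte); the function name is hypothetical and not part of any kernel tool.

```python
def decode_sdhci_command(cmd_reg: int) -> dict:
    """Split an SDHCI Command register value into its fields.

    Assumes the standard SDHCI layout: bits 13:8 command index,
    bits 7:6 command type, bit 5 data present, bit 4 index check,
    bit 3 CRC check, bits 1:0 response type.
    """
    response_types = {
        0: "no response",
        1: "136-bit response",
        2: "48-bit response",
        3: "48-bit response with busy",
    }
    return {
        "index": (cmd_reg >> 8) & 0x3F,
        "type": (cmd_reg >> 6) & 0x3,
        "data_present": bool(cmd_reg & 0x20),
        "index_check": bool(cmd_reg & 0x10),
        "crc_check": bool(cmd_reg & 0x08),
        "response": response_types[cmd_reg & 0x3],
    }


# Value taken from the "Cmd:" field of the register dump above.
fields = decode_sdhci_command(0x0000081A)
print(fields)
```

Under that layout, 0x081a decodes to command index 8 with a 48-bit response and no data transfer. Together with Argument 0x000001aa, this would be consistent with an SD SEND_IF_COND probe issued during card detection, though that reading is an inference and is not stated in the ticket.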

(The first line in the burst is level <3>; the subsequent lines are level <7>. Over a 12-hour test the number of occurrences varies from 0 to 42 depending on the unit. Only one test run per unit is available.)
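
Per-unit counts like these can be gathered by scanning each unit's kernel log for the first line of the burst. This is a rough sketch, assuming the logs are available as plain text keyed by a unit name; the helper names and the sample unit names are hypothetical, and the sample log lines are shortened copies of the message above.

```python
import re

# First line of each burst, as it appears in the kernel log.
TIMEOUT_RE = re.compile(r"mmc0: Timeout waiting for hardware interrupt")


def count_timeouts(log_text: str) -> int:
    """Count bursts by counting occurrences of the first line of the burst."""
    return sum(1 for line in log_text.splitlines() if TIMEOUT_RE.search(line))


def affected_units(logs: dict) -> list:
    """Return the names of units whose log contains at least one burst."""
    return sorted(unit for unit, text in logs.items() if count_timeouts(text) > 0)


# Hypothetical sample data: two units, one of which shows the message twice.
sample_logs = {
    "unit-a": (
        "[   36.064563] mmc0: Timeout waiting for hardware interrupt.\n"
        "[   36.064588] sdhci: =========== REGISTER DUMP (mmc0)===========\n"
        "[  912.100000] mmc0: Timeout waiting for hardware interrupt.\n"
    ),
    "unit-b": "[   12.000000] some unrelated kernel message\n",
}

counts = {unit: count_timeouts(text) for unit, text in sample_logs.items()}
print(counts)
print(affected_units(sample_logs))
```

Counting only the first line of the burst avoids counting the register dump lines that follow it as separate events.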

Change History

Changed 3 years ago by Quozl

In a batch of 27 runin test results, this message appears in kernel logs for 16 of the units.

Changed 3 years ago by wad

There is a definite correlation between this issue and the one in #11125.

Changed 3 years ago by wad

  • priority changed from normal to blocker
  • next_action changed from never set to diagnose

This has been seen with both SKUs of B1 laptops (Toshiba and SanDisk eMMCs). There are also a few B1 laptops that never seem to show the problem.

Changed 3 years ago by wad

  • cc wad added

Changed 3 years ago by cjb

I've pushed patches to the automatic clock gating code that attempt to fix this (to the olpc-kernel repo). Saadia's leaving a test running to see if it reproduces, and I think it'd be a good idea to put the kernel changes in a build too.

Changed 3 years ago by martin.langhoff

Recent report of this at #11119 (closed as duplicate) during a yum update.

Changed 2 years ago by Quozl

  • status changed from new to closed
  • next_action changed from diagnose to no action
  • resolution set to worksforme

Did not occur in C1 runin logs, closing.
