Ticket #8443 (new defect)

Opened 6 years ago

Last modified 5 years ago

Battery Percentage doesn't update until minutes after resume

Reported by: gnu Owned by: cjb
Priority: high Milestone: 8.2.0 (was Update.2)
Component: power manager (OHM) Version: Development build as of this date
Keywords: blocks-:8.2.0 polish:8.2.0 cjbfor8.2 relnote Cc: dsaxena, dilinger, smithbone, mikus@…
Action Needed: design Verified: no
Deployments affected: Blocked By:
Blocking:

Description

C2, Q2E15, 8.2-759.

After the XO comes out of a lid-close suspend, it lies about the state of its battery. The battery could be significantly changed from its state when it went into suspend. But the "My Battery" display does not change for minutes -- perhaps until a 1% change is seen by the CPU -- even though the battery charge itself has changed. Then it suddenly jumps to a new value.

This is not the same bug as #8010. In #8010, the battery percentage didn't update until capacity reached 14% (low battery warning). That got fixed. In this bug, it doesn't update until the battery changes by 1% AFTER resume -- even though the capacity changed WHILE SUSPENDED.

Fix: When resuming, always refresh the current battery status by polling the battery, rather than awaiting a 1%-change notification.
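The proposed fix can be sketched as follows. This is only an illustration of "poll on resume" and not the actual Sugar, HAL, or OHM code: the `BatteryDisplay` class and its method names are hypothetical, and the sysfs path is an assumption about where the kernel exposes the XO battery's charge percentage.

```python
from pathlib import Path
from typing import Callable

# Assumed sysfs location of the battery charge percentage; adjust for your build.
CAPACITY_PATH = Path("/sys/class/power_supply/olpc-battery/capacity")


def read_capacity_from_sysfs() -> int:
    """Poll the battery directly and return its charge as an integer percentage."""
    return int(CAPACITY_PATH.read_text().strip())


class BatteryDisplay:
    """Hypothetical stand-in for whatever caches the percentage shown in the Frame."""

    def __init__(self, poll: Callable[[], int] = read_capacity_from_sysfs) -> None:
        self._poll = poll
        self.percent = poll()  # initial value, read once at startup

    def on_soc_change(self, new_percent: int) -> None:
        """Normal path: a 1%-change notification carries the new value."""
        self.percent = new_percent

    def on_resume(self) -> None:
        """Proposed fix: do not trust the cached value after a resume; poll again."""
        self.percent = self._poll()
```

Passing the poll function in as a parameter keeps the sketch testable without real hardware. Whichever component actually learns about the resume would need to call `on_resume()`.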

|TestCase|

Remove AC power from a laptop. Put the cursor in the bottom right corner, bringing up the Frame. Put the cursor on the battery symbol until it displays the percentage full. Note the percentage. Close the lid. Now leave the laptop suspended for many hours. Open the lid. If the bug still exists, the percentage on the screen will be unchanged. If the bug still exists, you can move the cursor to remove the Frame, re-bring up the Frame, hover over the battery, and it will again tell you the same old percentage. If the bug still exists, and you leave the cursor hovering on the battery, at an unpredictable time after many minutes, the battery percentage will suddenly jump to a new number, without going through any of the intervening numbers.

You can also trigger this bug by closing the lid on a somewhat depleted battery, then plugging it into power while it's closed, letting it charge a while, then opening the lid. You'll see the old battery charge, not the new one, on the screen. This value will persist for minutes. If you want to see the real value, remove the AC power; the display will jump to the real charge percentage.

I did not test this case, but you may be able to trigger this bug by closing the lid on a somewhat depleted battery, then plugging it into power while it's closed, charging it fully, then opening the lid. You'll see the old battery charge, and this bogus value will probably persist indefinitely, since when it's already full the laptop will neither deplete the battery by 1% nor increase its charge by 1%. Removing AC power should cause it to update.

Attachments

pwr-080926-195057-96312a000000fcff.csv (391 bytes) - added by mikus 6 years ago.
yesterday's olpc-pwr-log snippet after LED turns green, showing battery charge "Full".

Change History

  Changed 6 years ago by cjb

  • cc dsaxena, dilinger, smithbone added
  • keywords blocks-:8.2.0 polish:8.2.0 cjbfor8.2 added; blocks?:8.2.0 removed
  • milestone changed from Not Triaged to 8.2.0 (was Update.2)

I think it's too late to block on this, but it's a valid bug. I'm not sure what the right way to push out the new value is -- Sugar reads battery status from HAL, and HAL gets battery status interrupts from the kernel, so should the kernel be creating one of these interrupts at resume-time?

  Changed 6 years ago by pgf

i would have thought it was HAL's job to notice that a suspend/resume had happened, and update its internal state accordingly. (but perhaps, as we discussed yesterday, there's no easy way for HAL to discover that?)

  Changed 6 years ago by dsaxena

Is there some way for OHM to poke HAL to re-read the battery state from sysfs?

  Changed 6 years ago by cjb

  • keywords relnote added

follow-up: ↓ 6   Changed 6 years ago by cjb

Not that I know of, OHM's interactions with HAL pretty much consist of reading what HAL tells it at the moment. I'll think about it.

in reply to: ↑ 5   Changed 6 years ago by rsmith

Replying to cjb:

Not that I know of, OHM's interactions with HAL pretty much consist of reading what HAL tells it at the moment. I'll think about it.

Hmmm. This might have worked a few EC versions back. Previously, a bunch of extra SCIs were generated on resume. I'm not sure if a battery status SCI was one of them, but it might have been. I "fixed" all that. :)

If needed, I can generate a battery 1% SCI or power status change on wakeup, since the EC knows when the system wakes. It won't be maskable (unless we use a separate bit), but it would never be generated as a wakeup, only after a suspend->active transition. I already have that state for dealing with the power button LED, so it's a trivial mod.
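A toy model of that idea, for readers unfamiliar with the EC. This is not the EC firmware; the `EcModel` class, the `PowerState` names, and the `"battery-soc"` event label are all hypothetical and exist only to show the "generate the event after a suspend->active transition, never as a wakeup source" behaviour.

```python
from enum import Enum, auto
from typing import List


class PowerState(Enum):
    ACTIVE = auto()
    SUSPENDED = auto()


class EcModel:
    """Toy model of the EC's wake handling (illustrative only)."""

    def __init__(self) -> None:
        self.state = PowerState.ACTIVE
        self.pending_events: List[str] = []

    def suspend(self) -> None:
        self.state = PowerState.SUSPENDED

    def wake(self) -> None:
        # Queue the battery event only on a suspend -> active transition,
        # so it is reported after the wakeup and never causes one itself.
        was_suspended = self.state is PowerState.SUSPENDED
        self.state = PowerState.ACTIVE
        if was_suspended:
            self.pending_events.append("battery-soc")
```

In this model the host would see one battery event after every resume and would refresh its cached percentage in response.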

  Changed 6 years ago by thomaswamm

  • next_action changed from never set to design
  • version changed from not specified to Development build as of this date

I have seen this bug often. I got used to it, having learned what goes on inside the XO. But this bug really should be fixed before the next million XO's are shipped. Most users won't read release notes, but will call tech support.

\\//_ Little bugs are big bugs when seen by a million users. _\\//

follow-up: ↓ 9   Changed 6 years ago by mikus

  • cc mikus@… added

I had a situation where the battery eeprom was somehow corrupted. After applying a "correction procedure" to the eeprom, my battery percentage now appears to have been recorded at boot time, with the displayed value not changing afterward.

The first time I booted, it showed 48% and stayed there. The second time I booted, it showed 73% - and *STILL* shows as 73%, even though the LED has turned from amber to green (presumably because the battery is now charged).

in reply to: ↑ 8 ; follow-up: ↓ 10   Changed 6 years ago by rsmith

Replying to mikus:

The first time I booted, it showed 48% and stayed there. The second time I booted, it showed 73% - and *STILL* shows as 73%, even though the LED has turned from amber to green (presumably because the battery is now charged).

Please do a discharge/recharge cycle while running olpc-pwr-log and send me the log files. If you have idle-suspend enabled, please turn it off.

in reply to: ↑ 9 ; follow-up: ↓ 11   Changed 6 years ago by mikus

Replying to rsmith:

Please do a discharge/recharge cycle while running olpc-pwr-log and send me the log files. If you have idle-suspend enabled, please turn it off.

I will not do that unless you give me a convincing explanation of what good it will do.

As far as I am concerned, the current problem (percentage shown not changing) is that the tools (from batman.fth) that I used, had set the battery's eeprom to NOT invoke the periodic interrupt. With the XO software to *recalculate* the charge percentage not being "started", the Frame palette keeps showing the same old percentage value (it's now about 20 hours later, and Frame *still* shows the percentage as 73%). [Note: within about an hour after the battery's eeprom was "cleared", both the 'battery LED' turned green, and olpc-pwr-log (run then) showed battery status as 'Full'.]

There's an attachment to #8690 that shows olpc-pwr-log output for a period right after the 'battery LED' not working was "fixed". That ought to be good enough to convince you that the battery *is* charging properly (so it is the XO software that is NOT reporting charge status properly, possibly because some bit was not set by the batman.fth tools, that should have been set).

Changed 6 years ago by mikus

yesterday's olpc-pwr-log snippet after LED turns green, showing battery charge "Full".

in reply to: ↑ 10 ; follow-up: ↓ 12   Changed 6 years ago by rsmith

Replying to mikus:

Replying to rsmith:

Please do a discharge/recharge cycle while running olpc-pwr-log and send me the log files. If you have idle-suspend enabled, please turn it off.

I will not do that unless you give me a convincing explanation of what good it will do.

It tells me what the EC thinks the SOC is, and whether it's changing and operating in a sane manner. The logs you have pointed to, though, have enough info to tell me what I wanted to see.

As far as I am concerned, the current problem (percentage shown not changing) is that the tools (from batman.fth) that I used, had set the battery's eeprom to NOT invoke the periodic interrupt. With the XO software to *recalculate* the charge percentage not being "started", the

It doesn't work like that. Batman simply re-initializes the values in the EEPROM to the factory defaults and none of the eeprom values control the 1% SOC change ticks. For a Life battery 95% of the eeprom values are unused.

The 1% SOC change tick delivery is controlled by a mask value that is set by the kernel. If Sugar fails to update the SOC, the most common cause so far has been a mask setting that has the ticks disabled. If you enable loglevel 9 and do a kernel suspend/resume, the logs will contain the commands that set that mask, so we can see what it's getting set to. In many cases the suspend/resume fixes things.
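To make the mask behaviour concrete, here is a minimal sketch. The bit values and names below are illustrative placeholders and are not taken from this ticket or from the real EC interface; the point is only that an event whose bit is cleared in the mask never reaches the host, so the displayed percentage goes stale.

```python
# Illustrative bit assignments (assumptions, not the real EC event bits).
EVENT_AC_POWER = 0x01
EVENT_BATTERY_SOC = 0x02


def delivered_events(raised: int, mask: int) -> int:
    """Return only the raised events whose bits are enabled in the mask."""
    return raised & mask


def soc_ticks_enabled(mask: int) -> bool:
    """True if the mask lets 1% SOC change ticks through."""
    return bool(mask & EVENT_BATTERY_SOC)
```

With a mask of `EVENT_AC_POWER` alone, a raised `EVENT_BATTERY_SOC` is dropped, while plugging or unplugging power still produces an event.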

You can also plug and unplug external power. That generates power events, which will cause Sugar to try to update the battery status. Restarting Sugar should also set things to the proper value. If you restart and it never changes value after that, then that further suggests that the SCIs are not happening.

in reply to: ↑ 11   Changed 6 years ago by rsmith

value. If you restart and then it never changes value after that then that further suggests that the SCI's are not happening.

Looking back through the ticket, I see this is indeed what's happening. Set your loglevel to 9 and change your syslogd to route everything for kern.* to kern.log, then run it for long enough that you should have had some 1% ticks (olpc-pwr-log can show you when it changes). Right before you quit, do a suspend/resume with the power button. Then send up your kern.log.

  Changed 6 years ago by mikus

Set your loglevel to 9

Please consider your audience. I have not the slightest idea what a 'loglevel' is, nor how to use it.
