Ticket #4606 (new enhancement)

Opened 6 years ago

Last modified 3 years ago

XO can't resume from suspend at a particular time set by software

Reported by: gnu
Owned by: rsmith
Priority: normal
Milestone: 9.1.0-cancelled
Component: hardware
Version: 1.0 Hardware
Keywords: power
Cc: cjb, JordanCrouse, jg, dsaxena, sascha_silbe
Action Needed: design
Verified: no
Deployments affected:
Blocked By: #6053
Blocking:

Description

The XO hardware doesn't have a wakeup source with finer than 1-second granularity or resolution. If the CPU wishes to suspend for 0.568 seconds, it can't. Much worse, it can't suspend for exactly 3 seconds either; the only way it can resume is on exact 1-second boundaries of the realtime clock (not 1-second boundaries from "now"). It can't even read out sub-second times from the RTC to figure out when the RTC will tick over to the next second.

There are tricks that we could play in the 5536 if we can reuse one or two pins to use a MFGPT to trigger a wakeup. Probably the easiest circumvention is to ask the EC to wake us up after so many milliseconds; its meter is running all the time, even when we're in suspend. Seems like a simple request, but as Jordan says, "nobody in x86 land has ever needed to do anything more drastic than" waking up on 1-second boundaries before. They probably thought their "alarm clock" was pretty cool for being able to wake up on any specified second, rather than just at an HH:MM time.

This limits what we can do for automatic suspend, since we would have to wake up the CPU most of a second before it's needed -- perhaps more than a second early, if kernel code that uses better timers isn't keeping track of how much the RTC has drifted against the system clock.

There's a possibly related bug #3359 (we lose up to a second in Unix's idea of what the time is, whenever we suspend). The bug has a very misleading title about network time servers, but the real bug is that the clock shouldn't lose time when we suspend.

Change History

Changed 6 years ago by cjb

  • cc cjb added

I noticed this limitation in China last week -- having to wake up on a one-second boundary slowed down our RTC wakeup tests.

Changed 6 years ago by jg

  • cc JordanCrouse, jg added
  • type changed from defect to enhancement
  • milestone changed from Never Assigned to FutureFeatures

Jordan, is this really true?

Changed 6 years ago by JordanCrouse

Yes, this is true. With just the traditional x86 components, the only wakeup source we have on a timer is the RTC, which has the 1 sec limitation. On the current hardware, our only alternative is to have an EC timer fire an SC back at us (presumably, the EC has the proper granularity).

With a hardware redesign, you could use the MFGPT timers to wake us up in a round-about way, but that would consume both a timer, and a GPIO, both of which are in short supply in the standby domain.

Changed 6 years ago by gnu

  • milestone deleted

Richard Smith says he can probably help us fix this by using a higher resolution timer in the EC to wake us up. Copying from emails with MLJ:

gnu said:

So much work has been done on this, yet so much remains. When the XO suspends, it can't resume at a preset time, except on 1-second boundaries. So if the kernel knows it has nothing to do for 2.7 seconds, it can't suspend for 2.7 seconds. It can't even suspend for 2 seconds, since it will wake when the battery backup realtime clock ticks into the next second, which will drift against the Unix time. The kernel can (and will have to) track this. I discovered this last week when looking into automatic suspend. Richard Smith can probably fix this by using a higher resolution timer in the EC to wake the system, or by tristating a gpio so we can reuse it for an MFGPT output/feedback-input during suspend.

MLJ said:

I'm sitting next to Richard Smith who agrees about the timing issues.

Trac won't let me keep the milestone as "FutureFeatures"; for somebody with my privileges, there's no such milestone :-) though there's a "Future Release". Personally I think this will need fixing to make suspend-on-kernel-idle useful for much of anything; we can't go to sleep unless we can predict with high probability exactly when we'll be waking up (at the latest). It's back to a blank milestone.

Changed 6 years ago by jg

  • milestone set to FutureFeatures

Changed 6 years ago by bemasc

  • milestone deleted

This doesn't appear to block suspend-on-idle, as long as the system has sufficiently rare scheduled events. If the CPU has no events scheduled for the next 5 seconds, it can go to sleep with a wakeup set for 4 seconds later than the current RTC. That wakeup will come somewhere between 3 and 4 seconds in the future, at which point the CPU will resume, and the kernel will have a few hundred milliseconds before it needs to trigger the event.
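The arithmetic in the comment above can be sketched as follows. This is only an illustration, not kernel code: the function name plan_rtc_alarm and the parameters margin_s and min_ticks are hypothetical, and the sketch assumes the RTC can be read only in whole seconds, so the true time lies somewhere inside the current RTC second.

```python
import math

def plan_rtc_alarm(rtc_now_s, idle_s, margin_s=1.0, min_ticks=2):
    """Pick a whole-second RTC alarm for an idle period of idle_s seconds.

    Returns (alarm_second, earliest_delay_s, latest_delay_s), or None if the
    idle period is too short to be worth suspending.
    All names and defaults here are assumptions made for illustration.
    """
    # Leave margin_s of slack before the next event, then round down to
    # whole RTC ticks because the alarm can only match a whole second.
    ticks = math.floor(idle_s - margin_s)
    if ticks < min_ticks:
        return None
    alarm = rtc_now_s + ticks
    # The sub-second phase of the RTC is unknown, so the alarm fires
    # somewhere between (ticks - 1) and ticks seconds from now.
    return alarm, ticks - 1, ticks

if __name__ == "__main__":
    # No events for 5 seconds: alarm 4 seconds past the current RTC second,
    # which arrives between 3 and 4 seconds from now.
    print(plan_rtc_alarm(100, 5.0))
    # A 2.7-second idle period is too short under these assumptions.
    print(plan_rtc_alarm(100, 2.7))
```

The min_ticks guard is a design choice in this sketch: it refuses alarms that could fire almost immediately, before the suspend has finished.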

Changed 6 years ago by jg

  • milestone set to FutureFeatures

Changed 6 years ago by gregorio

  • milestone deleted

Milestone FutureFeatures deleted

Changed 6 years ago by gnu

  • blocking 8094 added

(In #8094) This does not duplicate #2765 (power down DCON chip to save power); it is not even blocked by #2765. What it is blocked by is #4606 (XO can't resume based on a timer).

The X DPMS code would power off the screen and backlight, if its 20-minute timer ever woke it up. But because after a minute we're suspended, and suspend doesn't awaken based on timers, X never gets a chance to do its job. The fix is to resume based on pending timers.

Changed 6 years ago by gnu

  • next_action set to never set

We should do the best we can in software, with the hardware we have. This means figuring out when the next pending timer events are, and setting up a resume source (EC or MFGPT or RTC) to wake us up in time to dispatch the next one.
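A minimal sketch of that selection step, under stated assumptions: the function name choose_wakeup, the resume_latency_s parameter and the ec_timer_available flag are hypothetical, the EC timer is assumed to accept a delay in milliseconds (no such EC command is confirmed by this ticket), and the RTC is assumed to be aligned with the system clock, which the description says does not hold in practice. The MFGPT is left out because it cannot cause a wakeup on the XO as wired.

```python
import math

def choose_wakeup(now_s, deadlines_s, resume_latency_s=0.5, ec_timer_available=False):
    """Choose a resume source and setting for the earliest pending timer.

    Returns ("ec", delay_ms), ("rtc", alarm_second), or None if suspending
    is not worthwhile. All names are illustrative assumptions.
    """
    pending = [d for d in deadlines_s if d > now_s]
    if not pending:
        return None
    # Wake early enough to resume before the earliest timer must fire.
    wake_at = min(pending) - resume_latency_s
    if wake_at <= now_s:
        return None
    if ec_timer_available:
        # Hypothetical EC timer with millisecond resolution.
        return ("ec", round((wake_at - now_s) * 1000))
    # The RTC alarm matches whole seconds only: round down so the wakeup
    # is never late, and require a second that has not started yet.
    alarm = math.floor(wake_at)
    if alarm <= math.floor(now_s):
        return None
    return ("rtc", alarm)

if __name__ == "__main__":
    print(choose_wakeup(100.0, [104.0, 110.0]))
    print(choose_wakeup(100.0, [104.0], ec_timer_available=True))
```

A real implementation would also have to account for RTC drift against the system clock and for wakeups that arrive while the laptop is still suspending.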

Changed 6 years ago by wad

  • owner changed from wad to rsmith
  • next_action changed from never set to design
  • version set to Mass Production Hardware
  • milestone set to 9.1.0

This is actually more complicated. One of the lessons learned about the Geode is that the RTC cannot be used to reliably wake up the laptop (unless the wakeup is at least a minute in the future). The problem is that an RTC wakeup at the wrong time (specifically, while the laptop is still in the process of suspending itself) will crash it.

The solution is to use the EC as the timer source for wakeups. The EC is aware of the state of the rest of the system, and can avoid trying to wake it up while it is still suspending. This functionality is not supported by the current EC code.

Changed 6 years ago by JordanCrouse

This is common knowledge for most, but just to be clear again for the Google cache, MFGPTs cannot cause a wakeup.

Changed 5 years ago by gnu

  • cc dsaxena added
  • spec_stage set to unknown
  • spec_reviewed set to 0
  • blockedby 6053 added

#6053 also relates to accurate time and suspend (Linux slips up to a second from realtime whenever we suspend), and will need to be fixed to make timed suspends work.

RTC wakeups with less than 60 seconds' delay should be easily possible without danger. I think Wad was not thinking a literal "minute" when he wrote "unless at least a minute in the future". As he says, the danger is if a wakeup occurs DURING a suspend -- we should be able to time our suspends to well within a second, and our wakeups to within two seconds or better.

[As an addendum to Jordan's last comment: An MFGPT cannot cause a wakeup directly. But it can wiggle a GPIO pin. Some of those pins, if not driven from outside, can be internally wired inside the Geode companion chip to cause a wakeup. The XO is not wired that way (all of its GPIO pins are being driven for other things), but other Geode-based devices could be.]

Changed 4 years ago by sascha_silbe

  • cc sascha_silbe added

Changed 3 years ago by martin.langhoff

  • blocking 8094 removed