Ticket #7958 (closed defect: duplicate)
DCON showed old screen image during suspend, with extra black "dusty" spots
| Reported by: | gnu | Owned by: | dilinger |
|---|---|---|---|
| Priority: | blocker | Milestone: | Future Release |
| Component: | kernel | Version: | not specified |
| Keywords: | | Cc: | dilinger, rsmith |
| Action Needed: | never set | Verified: | no |
| Deployments affected: | | Blocked By: | |
| Blocking: | | | |
Description
This is a rare condition that I think I may have seen or heard of once before.
XO G1G1 MP "Xoroaster", S/N CSN7500230F, Joyride 2263, firmware is a custom special test version by rsmith: "Q20107", a post-Q2E12 but pre-Q2E13 version.
I had just run a power test for rsmith in a terminal activity. The machine was configured for Mesh channel 1 (Simple Mesh) during the test. The machine had suspended at the end of that test, showing that the battery was fully charged. (The screen shows a log of battery status every 10 seconds or so, from the olpc-pwr-log command.) I woke up the suspended system with a keypress, tried to scp and failed, switched the network configuration to use a local access point, and scp'd the files to another machine so I could send them to Richard. I sent the email at about 22:40 (Pacific time) and then got into an extended IRC conversation.
When I glanced back at the XO, at 23:07, it was suspended, and the screen was "spotty", with a lot of black dust mixed into the image. But the most interesting part is that the screen image was the image at the end of the battery test -- not including the subsequent commands!
I realized this was probably a DCON issue, and took two photos of the screen. From the IRC log, the image persisted until 23:20, when I had set up a camera to take a video of what happened when I resumed the system with a keypress. Those are attached. My prediction was that the screen would jump to show the correct contents immediately upon resume. Indeed, it did.
My theory is that the system suspended normally during the end of the power test. But its next suspend, 65 seconds after I finished scp-ing the files off the XO, was abnormal. The DCON missed the DCONLOAD signal that should've copied the current screen contents into the DCON's little 1MB DRAM buffer. When the suspend code switched the screen so the DCON would refresh it, it started refreshing from the *prior* contents of that buffer -- with some bit-rot speckles because the DRAM buffer doesn't get refreshed when it isn't in use. That's the theory.
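To make the theory concrete, here is a toy model of it in Python. It is only a sketch of the hypothesis: `ToyDcon`, `load`, `decay`, `freeze`, and `suspend` are made-up names for illustration and do not correspond to the kernel driver or to the real DCON interface, and the frame contents are stand-in values.

```python
import random


class ToyDcon:
    """Toy stand-in for the DCON and its frame buffer (hypothetical model)."""

    def __init__(self, size):
        self.buffer = [0] * size  # 0 represents a black pixel

    def load(self, frame):
        # Models a DCONLOAD that the DCON actually sees.
        self.buffer = list(frame)

    def decay(self, cells, rng):
        # Models unrefreshed DRAM: a few cells lose their contents.
        for i in rng.sample(range(len(self.buffer)), cells):
            self.buffer[i] = 0

    def freeze(self):
        # The DCON takes over refresh and shows whatever is in its buffer.
        return list(self.buffer)


def suspend(dcon, cpu_frame, load_seen):
    # On suspend the current frame should be copied in before the freeze.
    if load_seen:
        dcon.load(cpu_frame)
    return dcon.freeze()


rng = random.Random(0)
dcon = ToyDcon(size=64)
test_end = [1] * 64   # screen at the end of the power test
after_scp = [2] * 64  # screen after the later commands

first = suspend(dcon, test_end, load_seen=True)     # normal suspend
dcon.decay(cells=5, rng=rng)                        # buffer idle while awake
second = suspend(dcon, after_scp, load_seen=False)  # load missed

print(second.count(1), "stale pixels,", second.count(0), "black specks")
```

In this model the second suspend shows only the old image plus black specks, and nothing from the newer frame, which matches what I saw on the screen.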
Some time after the above, I captured "dmesg" output and have attached that as well. It seems to have the last four suspends. There are some odd kernel messages, but they're about the CAFE chip, not about the DCON.
In the GMT timezone of the laptop, the last power file was written at 2008-08-14 05:26, and the subsequent dmesg command was at 06:33.
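To line these GMT timestamps up with the Pacific times quoted earlier, the conversion can be sketched as below. It assumes Pacific Daylight Time (UTC-7), which is what "Pacific time" means in mid-August; the helper name `to_pacific` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Daylight Time, a fixed UTC-7 offset in August.
PDT = timezone(timedelta(hours=-7), "PDT")


def to_pacific(gmt_text):
    """Convert a 'YYYY-MM-DD HH:MM' GMT timestamp to Pacific Daylight Time."""
    gmt = datetime.strptime(gmt_text, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return gmt.astimezone(PDT)


last_power_file = to_pacific("2008-08-14 05:26")
dmesg_capture = to_pacific("2008-08-14 06:33")

print(last_power_file.strftime("%Y-%m-%d %H:%M %Z"))
print(dmesg_capture.strftime("%Y-%m-%d %H:%M %Z"))
```

Under that assumption the last power file was written at 22:26 Pacific, shortly before the 22:40 email, and the dmesg capture happened at 23:33 Pacific, after the 23:20 resume.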
Richard remembers some i2c problems with the CPU talking to the DCON that were never fully diagnosed; perhaps that's the root cause. He says the EC is not involved unless the DCON needs to be reset. (I didn't see any indication of a DCON reset in the dmesg log, but I don't know what to look for.)
(For contrast, see #2358 for a very early DCONLOAD problem while suspend was originally being debugged.)


