Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#10068 closed defect (fixed)

Black Xv overlay when lid-close suspending while Record is running

Reported by: cjb
Owned by: jon.nettleton
Priority: normal
Milestone: 11.2.0-M4
Component: kernel
Version: not specified
Keywords: viafb
Cc:
Blocked By:
Blocking:
Deployments affected:
Action Needed: diagnose
Verified: no

Description (last modified by Quozl)

In the Record activity, after a lid-close suspend and resume, the Xv overlay with live camera image is black.

ticket description was previously:

Sometimes, when starting Record with Xv enabled, we're seeing a black screen for the overlay window instead of the wanted camera data. When I switch to the VT console and then back, the overlay fixes itself.

Test in os120.

Attachments (7)

LeaveVT-nosleep (3.3 KB) - added by jon.nettleton 5 years ago.
EnterVT-sleep (3.3 KB) - added by jon.nettleton 5 years ago.
LeaveVT-sleep (3.3 KB) - added by jon.nettleton 5 years ago.
EnterVT-nosleep (3.3 KB) - added by jon.nettleton 5 years ago.
log1-fresh (17.4 KB) - added by cjb 5 years ago.
log2-after-suspend (17.4 KB) - added by cjb 5 years ago.
log3-after-VT-switch-after-suspend (17.4 KB) - added by cjb 5 years ago.


Change History (39)

comment:1 Changed 5 years ago by cjb

(I think it might be correlated with a suspend having happened, but I'm not sure.)

comment:2 Changed 5 years ago by jon.nettleton

This is definitely broken during a suspend/resume cycle. Attached are four register dumps, taken at LeaveVT and EnterVT, before and after suspend. The LeaveVT-sleep dump is the broken one.

comment:3 Changed 5 years ago by jon.nettleton

A number of registers differ between the broken state and the working state. Is it possible to just switch away from the X VT on suspend and back on resume, so that the suspend code that already exists in the driver runs? If the DCON is frozen this should be seamless to the user.

comment:4 Changed 5 years ago by pgf

the switch takes time. this was probably more significant on XO-1 than on 1.5, but my impression is that it was significant on XO-1.

comment:5 Changed 5 years ago by jon.nettleton

On this hardware it is not so much of an issue. Freezing and unfreezing the DCON is much slower.

time chvt 1

real 0m0.037s
user 0m0.000s
sys 0m0.000s

time chvt 3

real 0m0.046s
user 0m0.000s
sys 0m0.000s

time (chvt 1 && chvt 3)

real 0m0.066s
user 0m0.000s
sys 0m0.000s

time ((echo 1 > freeze) && (chvt 1 && chvt 3) && (echo 0 > freeze))

real 0m0.568s
user 0m0.000s
sys 0m0.010s

comment:6 Changed 5 years ago by cjb

We're also unsure how performing VT switches all the time (when aggressive suspend happens) will affect userspace -- we don't want to do anything to break the illusion of the suspend happening mostly invisibly.

OFW's able to restore some VX855 registers on resume already; I think we can just work out exactly which others to add (and their values) for Xv using the register dump.

comment:7 Changed 5 years ago by pgf

that half-second DCON switch is a B2 issue.
is there a C2 machine on its way to you? DCON freeze/unfreeze was problematic on B2, due to the choice of a surprisingly restricted (unusable) interrupt input.

# time chvt 1

real 0m0.036s
user 0m0.000s
sys 0m0.000s

# time chvt 3

real 0m0.036s
user 0m0.000s
sys 0m0.000s

# time (chvt 1 && chvt 3) [ this isn't always reliable -- i.e., it doesn't always switch back. :-/ ]

real 0m0.054s
user 0m0.000s
sys 0m0.010s

# time ((echo 1 > freeze) && (chvt 1 && chvt 3) && (echo 0 > freeze))

real 0m0.072s
user 0m0.000s
sys 0m0.020s
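
For a side-by-side view of the two sets of timings, here is a minimal Python sketch that parses the `real` values quoted in comment:5 and comment:7 and computes how much wrapping the VT switch in a DCON freeze/unfreeze adds in each case. The helper names `parse_real` and `freeze_overhead` are made up for illustration; only the numbers are taken from the comments.

```python
import re

def parse_real(line):
    """Convert a `time` line such as 'real 0m0.568s' to seconds."""
    m = re.fullmatch(r"real\s+(\d+)m([\d.]+)s", line.strip())
    if m is None:
        raise ValueError("not a 'real' timing line: %r" % line)
    return int(m.group(1)) * 60 + float(m.group(2))

# 'real' values copied from comment:5 and comment:7.
timings = {
    "comment:5": {"switch": "real 0m0.066s", "switch+freeze": "real 0m0.568s"},
    "comment:7": {"switch": "real 0m0.054s", "switch+freeze": "real 0m0.072s"},
}

def freeze_overhead(entry):
    """Seconds added by wrapping the VT switch in a DCON freeze/unfreeze."""
    return parse_real(entry["switch+freeze"]) - parse_real(entry["switch"])

overheads = {name: freeze_overhead(e) for name, e in timings.items()}

for name, seconds in overheads.items():
    print("%s: freeze/unfreeze adds about %.0f ms" % (name, seconds * 1000))
```

The VT switch itself costs roughly the same in both runs; nearly all of the difference between the two machines comes from the freeze/unfreeze step.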

comment:8 Changed 5 years ago by wmb@…

OFW does indeed establish the values of quite a few display-related registers on resume - but for the most part it sets them to known-good values instead of trying to do the save/restore dance.

Results of analyzing register diffs between LeaveVT-sleep and EnterVT-sleep:

a) Some registers should only matter when using a legacy VGA display mode - i.e. irrelevant in a frame buffer mode: SR04,15,16,1a,1c,1d CR00-0A,10-18,33,35 GR05,06,07 AR10-13 Misc bit 5

b) Some registers apply to the primary display controller, unused for the XO: SR 44,45,46

c) OFW uses slightly different timing (56.2 MHz) than the Linux driver (56.8 MHz): SR4a,4b,4c CR56,57,5e,5f

d) Others: CR36 affects nonexistent external CRT. CR6a is for indexed color mode.

e) Read-only registers: SR3d,3e,5b,5c CR60,61

f) LVDS stuff and LCD power sequencing that is irrelevant for OLPC: CR88,8a-90,99

g) Performance tuning: SR58,59 CR68,94,95

Misc bit 7 - vertical sync polarity - might possibly be a problem.

The value of CR9B (0x00) in EnterVT-sleep seems blatantly wrong. When I set that register to 0 in OFW, I lose the raster entirely. The correct value 0x1b selects the secondary display engine as the data source to feed to the DCON.
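
The filtering described in this comment could be sketched roughly as follows in Python. This is an illustration only: the dump format of the attachments is not shown in the ticket, so the dumps are modelled as plain dictionaries, and `expand`, `IGNORED` and `suspicious_diffs` are hypothetical names. The "Misc" bits are left out because they are not indexed registers in this naming scheme.

```python
def expand(prefix, spec):
    """Expand a spec such as '00-0A,33' into names 'CR00' ... 'CR0A', 'CR33'."""
    names = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo
        for index in range(int(lo, 16), int(hi, 16) + 1):
            names.add("%s%02X" % (prefix, index))
    return names

# Registers that comment:8 classifies as not relevant, groups (a) to (g).
IGNORED = (
    expand("SR", "04,15,16,1a,1c,1d") | expand("CR", "00-0A,10-18,33,35")  # (a)
    | expand("GR", "05,06,07") | expand("AR", "10-13")                     # (a)
    | expand("SR", "44,45,46")                                             # (b)
    | expand("SR", "4a,4b,4c") | expand("CR", "56,57,5e,5f")               # (c)
    | expand("CR", "36,6a")                                                # (d)
    | expand("SR", "3d,3e,5b,5c") | expand("CR", "60,61")                  # (e)
    | expand("CR", "88,8a-90,99")                                          # (f)
    | expand("SR", "58,59") | expand("CR", "68,94,95")                     # (g)
)

def suspicious_diffs(good, bad):
    """Return {name: (good, bad)} for differing registers not in IGNORED."""
    return {
        name: (good[name], bad[name])
        for name in sorted(good.keys() & bad.keys())
        if good[name] != bad[name] and name not in IGNORED
    }

# Made-up excerpt: only the CR9B values (0x1b correct, 0x00 observed)
# come from the comment; the other entries are invented for the example.
good = {"CR9B": 0x1B, "CR36": 0x31, "SR58": 0x00}
bad = {"CR9B": 0x00, "CR36": 0x11, "SR58": 0x01}
print(suspicious_diffs(good, bad))
```

A filter like this is only a first pass: comment:12 later points at CR00-CR05, which group (a) would have hidden.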

comment:9 Changed 5 years ago by cjb

Here are three more dumps. They are:

  • after a fresh reboot
  • after a suspend, in the broken-overlay state
  • after a VT switch after a suspend

I used Harald's via-chrome-tool to take the dumps, so they are probably a slightly different register set to Jon N's.

comment:10 Changed 5 years ago by Quozl

  • Action Needed changed from never set to add to build

comment:11 Changed 5 years ago by Quozl

  • Action Needed changed from add to build to diagnose

Tested with q3a36a, started Record, saw myself, closed lid, waited for suspend, opened lid, waited for resume, Xvideo overlay was black. Failed test.

comment:12 Changed 5 years ago by jon.nettleton

  • Component changed from x window system to kernel
  • Keywords viafb added

This problem is caused by CR00-CR05 not being preserved across suspend/resume. After talking with Mitch we think that the viafb suspend/resume code should be responsible for saving and restoring the CRT controller and sequencer registers, instead of doing this in OFW.

For performance reasons Mitch suggested the registers be accessed through MMIO space.

<Mitch_Bradley> I believe that it is possible to access the CRT and SEQ registers through MMIO space, using offsets 0x83cX and 0x83dX, instead of through I/O ports 3cX and 3dX, thus (probably) making it go faster. I/O port accesses are inherently quite slow, to the tune of nearly 1 uS per access, thus costing a total of 4 uS for every such register you want to save/restore.
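
One reading of the 4 uS figure is that each indexed register needs four port accesses over a full cycle: an index write plus a data read to save it, and an index write plus a data write to restore it. The Python sketch below is a toy model of that access pattern, not the viafb code: `IndexedPort`, `save` and `restore` are hypothetical names, the register values are invented, and no real hardware is touched.

```python
class IndexedPort:
    """Toy model of a VGA-style index/data register pair that counts accesses."""

    def __init__(self, values):
        self.values = dict(values)
        self.index = 0
        self.accesses = 0

    def write_index(self, index):
        self.accesses += 1
        self.index = index

    def read_data(self):
        self.accesses += 1
        return self.values.get(self.index, 0)

    def write_data(self, value):
        self.accesses += 1
        self.values[self.index] = value

def save(port, indices):
    """Read each indexed register: one index write and one data read."""
    saved = {}
    for i in indices:
        port.write_index(i)
        saved[i] = port.read_data()
    return saved

def restore(port, saved):
    """Write each saved register back: one index write and one data write."""
    for i, value in saved.items():
        port.write_index(i)
        port.write_data(value)

CR_INDICES = range(0x00, 0x06)                        # CR00-CR05
crt = IndexedPort({i: 0x10 + i for i in CR_INDICES})  # invented values
saved = save(crt, CR_INDICES)
crt.values.clear()                # stand-in for state lost across suspend
restore(crt, saved)

US_PER_PORT_ACCESS = 1.0          # "nearly 1 uS per access" from the quote
cost_us = crt.accesses * US_PER_PORT_ACCESS
print("%d accesses, about %.0f us via I/O ports" % (crt.accesses, cost_us))
```

The access count scales linearly with the number of registers preserved, which is why a faster access path matters more as the list of saved registers grows.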

comment:13 Changed 5 years ago by cjb

  • Priority changed from normal to blocker

Bumping up to blocker for the upcoming 10.1.1 release; would like to get this fixed.

comment:14 Changed 5 years ago by cjb

  • Action Needed changed from diagnose to test in build
  • Description modified (diff)

comment:15 Changed 5 years ago by cjb

Weird, I thought I'd typed "Test in os120."

comment:16 Changed 5 years ago by Quozl

  • Action Needed changed from test in build to diagnose

Tested in os120, a lid-close suspend with Record active still results in Record live image black on resume.

(The workaround of switching to VT1 and then VT3 still works; but the camera LED switches at that time as well, so I presume Record is stopping and starting the pipeline.)

comment:17 follow-up: Changed 5 years ago by cjb

That's a different test -- the use case this bug is supposed to fix is more like:

  • boot machine
  • machine idle-suspends
  • wake up machine
  • start Record
  • overlay should not be black

Does that case work for you? If so, we have a different bug of "Record breaks if you suspend/resume while it's streaming" (which isn't as surprising).

comment:18 in reply to: ↑ 17 Changed 5 years ago by Quozl

  • Action Needed changed from diagnose to review

Replying to cjb:

  • boot machine
  • machine idle-suspends
  • wake up machine
  • start Record
  • overlay should not be black

Yes, this works on os121.

comment:19 Changed 4 years ago by dsd

confirmed in os121:

Idle suspend, resume, start Record, camera display is working OK (original bug fixed)

Start Record, close lid, open lid, camera display is black (newly observed bug is present)

workaround for new bug: close Record, open again.

comment:20 Changed 4 years ago by Quozl

Newly observed bug confirmed in os122 with Record-76. (Start Record, close lid, open lid, camera display is black).

Original bug confirmed fixed in os122.

comment:21 follow-up: Changed 4 years ago by mikus

Once in Record-76 on os122 at 'High' quality, the initial frame(s) recorded on the video clip were of a jagged "out-of-synch" visual pattern. [I already saw this when *making* the clip - it happened just after having clicked on the button, as the progress bar below the screen started being drawn.] Within a fraction of a second the (moire?) pattern was replaced by the correct picture, without me having to do anything (e.g., I did not touch the lid).

comment:22 Changed 4 years ago by cjb

  • Priority changed from blocker to high
  • Summary changed from Black Xv overlay when starting Record to Black Xv overlay when lid-close suspending while Record is running

comment:23 in reply to: ↑ 21 Changed 4 years ago by Quozl

Replying to mikus:

Once in Record-76 on os122 at 'High' quality, the initial frame(s) recorded on the video clip were of a jagged "out-of-synch" visual pattern.

That has been fixed, see #10145.

comment:24 Changed 4 years ago by Quozl

Faster workarounds for Black Xv overlay when lid-close suspending while Record is running:

  • Alt/Tab out of the activity and back into it, or;
  • use F1, F2, or F3 and then F4.

Why: Record connects a handler for the visibility-notify-event and detects fully obscured condition, stopping all pipes, and then restarting them when no longer obscured.
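
The mechanism behind these workarounds can be sketched without any toolkit dependency. The Python below is not Record's code: `FakePipeline` and `VisibilityWatcher` are hypothetical stand-ins, and the two state constants stand in for the toolkit's visibility states. It assumes that a lid-close suspend delivers no visibility event to the window.

```python
FULLY_OBSCURED = "fully-obscured"  # stand-in for the toolkit's obscured state
UNOBSCURED = "unobscured"          # stand-in for the toolkit's visible state

class FakePipeline:
    """Stand-in for the camera pipeline; only tracks running state."""

    def __init__(self):
        self.running = True
        self.restarts = 0

    def stop(self):
        self.running = False

    def start(self):
        self.running = True
        self.restarts += 1

class VisibilityWatcher:
    """Stops the pipeline when fully obscured, restarts it when visible again."""

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def on_visibility_notify(self, state):
        if state == FULLY_OBSCURED:
            if self.pipeline.running:
                self.pipeline.stop()
        elif not self.pipeline.running:
            self.pipeline.start()

pipeline = FakePipeline()
watcher = VisibilityWatcher(pipeline)

# Lid-close suspend/resume: assumed to deliver no visibility event,
# so the pipeline (and the overlay setup behind it) is never restarted.
print("restarts after lid-close resume:", pipeline.restarts)

# Alt/Tab away and back: obscured, then visible again.
watcher.on_visibility_notify(FULLY_OBSCURED)
watcher.on_visibility_notify(UNOBSCURED)
print("restarts after Alt/Tab round trip:", pipeline.restarts)
```

Under that assumption, any action that obscures and then re-exposes the window forces a pipeline restart, which is what brings the overlay back.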

comment:25 Changed 4 years ago by Quozl

  • Action Needed changed from review to diagnose
  • Description modified (diff)

Verified in os125; idle suspend does not happen with Record showing live camera video, and directed suspend using the power button switches away from X, causing Record to shut down the pipes. The symptom now only occurs with lid-close suspend and resume. A quick hack might be to switch away from X for lid-close suspend.

comment:26 Changed 4 years ago by edmcnierney

  • Priority changed from high to normal

Since the use case is now limited to lid-close suspend when recording (anomalous behavior), dropping priority to normal per review with cjb.

comment:27 Changed 4 years ago by dsd

I was struggling with a similar problem for XO-1 and got totally distracted because I thought the problem was in the video driver, but it was a bug in the camera driver.

In the XO-1.5 case on this ticket, it's definitely an Xv issue. Simple test case to reproduce this bug:

stop prefdm
xinit /usr/bin/gst-launch v4l2src ! xvimagesink

Perform a lid-suspend and resume while the video is on-screen. It will go black.

Now repeat the same using regular X rendering:

stop prefdm
xinit /usr/bin/gst-launch v4l2src ! ffmpegcolorspace ! ximagesink

After lid-suspend + resume, the image is still OK.

comment:28 Changed 4 years ago by Quozl

  • Milestone changed from 10.1.1 to 10.1.2

comment:29 Changed 4 years ago by Quozl

  • Milestone changed from 10.1.2 to 10.1.3

comment:30 Changed 4 years ago by dsd

  • Milestone changed from 11.2.0-M3 to 11.2.0-M4

comment:31 Changed 4 years ago by dsd

  • Resolution set to fixed
  • Status changed from new to closed

Fixed by the extended viafb suspend/resume (S/R) code in olpc-2.6.35. Confirmed working in 11.2.0 build 11 on XO-1.5.

comment:32 Changed 4 years ago by martin.langhoff

Great! Happy happy happy.
