Ticket #12644 (closed defect: fixed)

Opened 18 months ago

Last modified 17 months ago

vmeta playback is unreliable with the vmetaxv optimization

Reported by: dsd Owned by: dsd
Priority: high Milestone: 13.2.0
Component: x window system Version: not specified
Keywords: Cc:
Action Needed: diagnose Verified: no
Deployments affected: Blocked By:
Blocking:

Description

  1. Install 13.1.0 build 35 on XO-1.75.
  2. Install libvmeta-marvell and gstreamer-plugins-marvell.
  3. Download http://techslides.com/demos/sample-videos/small.mp4 and play it in totem.

This works fine on 12.1.0, but fails on 13.1.0 build 35 (with the same vmeta packages in step 2). Only the first frame is shown. The audio does play. Then totem hangs at the end of playback.

Change History

Changed 18 months ago by dsd

Tested versions gstreamer-plugins-marvell-0.10-3.olpc and libvmeta-marvell-005-1.olpc

Changed 18 months ago by dsd

"gst-launch playbin2 uri=" does work on both versions, but it is unreliable (on both setups), frequently failing with xcb errors.

Changed 18 months ago by dsd

This fails with the same message (after a few tries):

gst-launch filesrc location=small.mp4 ! decodebin ! vmetaxvimagesink

But with normal xvimagesink things look OK:

gst-launch filesrc location=small.mp4 ! decodebin ! xvimagesink

Changed 17 months ago by dsd

  • summary changed from vmeta mp4 playback not working on 13.1.0 to vmeta playback is unreliable with the vmetaxv optimization

This is the likely cause of the vmeta instability we have seen in the past, where some videos play fine in totem but not in jukebox, or some videos only play some of the time.

Changed 17 months ago by dsd

  • next_action changed from never set to test in build

By adding some extra messages in libX11 where the crash was happening I could see that two threads were doing simultaneous X communication in the crash case. Bad idea.

Catching the "intruding thread" in gdb led me to a part of the codebase that runs outside of the X lock.

Fixed in gst-plugins-vmetaxv-0.10.36.3.
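
As a rough illustration of the failure mode (not the actual vmetaxv change), the sketch below models an X connection that counts how often two threads are inside a request at the same time. `FakeConnection`, `run` and `x_lock` are hypothetical names invented for this example. The point is only that every code path that talks to X has to take the same lock.

```python
import threading
import time


class FakeConnection:
    """Hypothetical stand-in for an X connection; records overlapping use."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the bookkeeping only
        self._inside = 0                 # threads currently mid-request
        self.overlaps = 0                # times two threads were inside at once

    def request(self):
        with self._guard:
            self._inside += 1
            if self._inside > 1:
                self.overlaps += 1
        time.sleep(0.001)                # pretend to talk to the server
        with self._guard:
            self._inside -= 1


def run(conn, x_lock, threads=4, requests=10):
    """Issue requests from several threads, all under the same x_lock."""
    def worker():
        for _ in range(requests):
            with x_lock:                 # every caller must take the X lock
                conn.request()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return conn.overlaps
```

With a shared lock, `run(FakeConnection(), threading.Lock())` reports no overlaps. A worker that skipped the `with x_lock:` line would play the role of the "intruding thread".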

Changed 17 months ago by dsd

  • next_action changed from test in build to diagnose

It's more stable, but I have already seen two hangs of the same type.

On the 2nd one, I tried to click pause and that hung as well at:

#0  0x40f52e10 in ?? () from /lib/libpthread.so.0
#1  0x40f4db7c in pthread_mutex_lock () from /lib/libpthread.so.0
#2  0x40e9687c in g_mutex_lock () from /lib/libglib-2.0.so.0
#3  0x46974d20 in gst_vmetadec_play2pause (vmetadec=0x44c18d48)
    at gstvmetadec.c:4281
#4  gst_vmetadec_change_state (element=<optimized out>, 
    transition=<optimized out>) at gstvmetadec.c:4334

Changed 17 months ago by dsd

The vmetaxvimagesink plugin is at fault and removing it will work around this bug (with a certain loss in performance).

The vmetaxv plugin tries to speed up video playback by recognising when the video frames to be shown on screen have come from the vmeta decoder. Such frames already have a known physical address, so there is no need to copy the image data into a new buffer (as xvimagesink would do); we can just pass the physical address directly and have the image shown on screen.

This sounds sensible but the implementation really isn't. It is all done through Xv, but instead of copying the image data into a buffer passed to Xv, it simply sets a magic bytestring and writes the physical address where the image data normally would be, then calls XvShmPutImage(). The X driver receives this, notices the magic, handles it correctly, and changes the magic value to a second value to be passed back to the client. The client realises that the magic has changed and goes ahead and frees the buffer.
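
The handshake described above can be modelled in a few lines. This is only a sketch. The magic values, the 4-byte-magic-plus-32-bit-address layout and all function names are assumptions made for illustration, since the ticket does not give the real ones.

```python
import struct

# Hypothetical values; the real magic byte strings are not given in the ticket.
MAGIC_REQUEST = b"VMXV"   # client: "payload is a physical address"
MAGIC_DONE = b"DONE"      # driver: "address consumed"


def client_fill(shm, phys_addr):
    """Write the request magic and physical address instead of pixel data."""
    shm[0:4] = MAGIC_REQUEST
    shm[4:8] = struct.pack("<I", phys_addr)


def driver_put_image(shm):
    """Driver side: recognise the magic, take the address, flip the magic."""
    if bytes(shm[0:4]) != MAGIC_REQUEST:
        return None                       # ordinary image data, not modelled
    (phys_addr,) = struct.unpack("<I", shm[4:8])
    shm[0:4] = MAGIC_DONE
    return phys_addr


def client_may_free(shm):
    """Client side: the buffer is freed only once the magic has changed."""
    return bytes(shm[0:4]) == MAGIC_DONE
```

Here `shm` would be a `bytearray(8)` standing in for the shared-memory image. Note that `client_may_free()` only ever becomes true if `driver_put_image()` actually runs.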

The problem here is that XvShmPutImage() sometimes returns without drawing the image on screen, without even invoking the Xv part of the X driver. It commonly does this as totem is starting up: the first few calls don't seem to "make it through" to the driver, perhaps because the video area is not visible yet. So the magic value does not get changed, which causes the vmetaxv driver not to free the buffer. This happens more than 16 times during totem startup, which means we run out of buffers, and since we never free them, things go wrong from that point.
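
The buffer leak can be sketched as simple counting. The pool size of 16 is inferred from the "more than 16 times" remark above, and `frames_until_exhausted` is a hypothetical helper, not part of any real code.

```python
POOL_SIZE = 16  # assumption: inferred from the ticket, not read from the code


def frames_until_exhausted(reached_driver):
    """Return the index of the first frame that finds no free buffer.

    reached_driver holds one boolean per frame: True if the put-image call
    reached the driver (magic flipped, buffer freed), False if it returned
    without reaching it (buffer never freed). Returns None if the pool
    never runs out while processing the given frames.
    """
    free = POOL_SIZE
    for index, reached in enumerate(reached_driver):
        if free == 0:
            return index
        free -= 1            # decoder hands a buffer to the sink
        if reached:
            free += 1        # driver flipped the magic, sink frees it
    return None
```

Sixteen dropped calls at startup are enough: the seventeenth frame has nowhere to go, no matter how well later calls would have behaved.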

It does not seem possible to detect if XvShmPutImage() actually sent its operation to the driver or not. There are times when the driver does not receive the request during the XvShmPutImage() call and the frame doesn't even get drawn in the time that follows, so busy looping waiting for the draw is not realistic (it would have to have a nasty timeout). Also I have seen cases during totem startup where it returns before the driver has drawn it, but then the driver does come along and draw it later. This system is inherently racy and needs a redesign (hopefully a better design...).

For now I think it is safe for the vmetaxv code simply to free the buffer after sending it to XvShmPutImage(), even if the Xv driver has not sent it to the screen (i.e. the magic value has not been changed). As I have seen cases where the drawing is simply delayed, this does introduce a risk of drawing stale data to the screen. But at least the stale data comes from a known memory area and I don't see this being problematic.
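
A toy model of this trade-off, under stated assumptions: `Pool` and `play` are hypothetical, buffers are reused in FIFO order, and a "delayed" draw is modelled as the driver reading the buffer only after all decoding has finished. It shows that the pool can no longer run dry, and that a late draw may read a buffer that has since been recycled.

```python
class Pool:
    """Hypothetical fixed pool of decoder buffers, each holding one frame."""

    def __init__(self, size=16):
        self.data = [None] * size        # frame contents per buffer
        self.free = list(range(size))    # indices available for reuse

    def acquire(self, frame):
        index = self.free.pop(0)
        self.data[index] = frame
        return index

    def release(self, index):
        self.free.append(index)


def play(pool, frames, delayed):
    """Show frames, releasing each buffer right after the put-image call.

    Frames listed in `delayed` are drawn only after all decoding is done,
    modelling a draw that the driver performs late. Returns a list of
    (frame expected, frame actually read from the buffer) pairs.
    """
    drawn, pending = [], []
    for frame in frames:
        index = pool.acquire(frame)
        if frame in delayed:
            pending.append((frame, index))       # driver will read it later
        else:
            drawn.append((frame, pool.data[index]))
        pool.release(index)                      # freed regardless of magic
    for frame, index in pending:
        drawn.append((frame, pool.data[index]))  # may be a newer frame by now
    return drawn
```

With a two-buffer pool and frame 0 delayed, `play(Pool(size=2), [0, 1, 2, 3], delayed={0})` ends with the pair `(0, 2)`: the late draw of frame 0 shows frame 2's data instead. The data shown is wrong, but it is still valid image data from a pool buffer.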

Changed 17 months ago by dsd

  • status changed from new to closed
  • resolution set to fixed

Fixed in gstreamer-plugins-vmetaxv-0.10.36.4

After an hour of testing playing back a short video in a loop in totem on 2 XO-1.75s, this seems to be stable and working.
