Opened 5 years ago

Closed 4 years ago

#9325 closed defect (fixed)

full-screen fills are slow in X

Reported by: cjb
Owned by: cjb
Priority: normal
Milestone: 1.5-software-later
Component: x window system
Version: not specified
Keywords:
Cc: dsd, HaraldWelte@…, mikus, sascha_silbe
Blocked By:
Blocking:
Deployments affected:
Action Needed: code
Verified: no

Description

Full-screen color blits are often slow enough that you can watch them happen; the cause is not yet known. Examples include an activity redrawing the whole screen for its splash screen at startup, and the GDM fade-in animation.

A good first step would be to use oprofile or similar while the operation is happening. If we end up in unaccelerated paths, work out why we left the fast path.

Attachments (2)

test.py (485 bytes) - added by Quozl 5 years ago.
Test program. On XO-1 os802b1 result is 71.67 flips per second. On XO-1.5 os201 result is 10.10 flips per second.
test2.py (663 bytes) - added by Quozl 5 years ago.
Second test program that attempts to demonstrate if there is any significant change in the frame rate between the first few frames and the remaining frames. Runs of this test show a mild improvement over the test run. XO-1.5 build os114 yields 40 fps, build os116 with SWRandR enabled yields 30 fps, with screen rotated yields 12 fps. XO-1 build os802B1 yields 70 fps. In all tests, the first measurement made was smaller.


Change History (19)

comment:1 Changed 5 years ago by cjb

Another example is changing backgrounds (which invokes a fade out/in) in the GNOME "appearance" control panel.

comment:2 Changed 5 years ago by dsd

  • Cc HaraldWelte@… added

comment:3 Changed 5 years ago by dsd

Various code paths through the openchrome driver would be improved once we get drm/agp working in #9361.

comment:4 Changed 5 years ago by Quozl

Does this problem still reproduce? How slowly are the blits happening?

comment:5 Changed 5 years ago by dsd

It's still just as slow as it was before.

comment:6 Changed 5 years ago by triagebot

  • Milestone changed from 1.5-software to 1.5-software-update

changed by irc user Quozl:

comment:7 Changed 5 years ago by mikus

  • Cc mikus added

#9497 (for Terminal) was closed as a duplicate of #9325 (for full screen).

Changed 5 years ago by Quozl

Test program. On XO-1 os802b1 result is 71.67 flips per second. On XO-1.5 os201 result is 10.10 flips per second.

comment:8 Changed 5 years ago by cjb

  • Action Needed changed from diagnose to test in build

test in os114

comment:9 Changed 5 years ago by Quozl

Tested in os114, 35.63 flips per second up from 10.10. Faster than it was, but still half the speed of an XO-1.

comment:10 Changed 5 years ago by sascha_silbe

  • Cc sascha_silbe added

comment:11 Changed 5 years ago by mikus

Running 'python test.py' :

  Serial       ROM     Build  #flips/sec
  CSN74804910  Q2E42e  802B1      63    (msg:  no protocol specified)
  CSN748028A5  Q2E42e  802B4      68
  CSN74801834  Q2E42e  os10       72    (2010/03/12 kernel)
  CSN750001FB  Q2E42e  os11       74    (2010/03/12 kernel)
  SHC9370111D  Q3A35   os114      56

Note: The *first* time I ran 'python test.py' on any machine, the flips/sec were usually lower. The above numbers are after repeated runs of test.py.

comment:12 Changed 5 years ago by mikus

Some more data points, running 'python test.py' :

Serial       ROM     Build  #flips/sec
CSN74900FB3  Q2E42e  802B5      72
CSN74903BE3  Q2E42e  110-py     74
SHC834024E2  Q2E42e  os13       76    (2010/03/22 kernel)
SHC9370111D  Q3A35   os116      62

What I noted on all the F11 XO-1 machines was that the first time I ran 'python test.py', the flips/sec were in the low 60s. On subsequent runs they were normally in the 70s. I am guessing that the XO-1 video hardware is capable of at least the mid-70s, but that the software implementation could be the bottleneck. Perhaps the first run filled some sort of "working cache", which then did not need to be refilled by the subsequent runs.

This might explain the somewhat slower performance on the XO-1.5 -- there the flips/sec started around 60, and never improved. Perhaps the XO-1.5 systems were not able to employ any such "working cache" implementation.

Changed 5 years ago by Quozl

Second test program that attempts to demonstrate if there is any significant change in the frame rate between the first few frames and the remaining frames. Runs of this test show a mild improvement over the test run. XO-1.5 build os114 yields 40 fps, build os116 with SWRandR enabled yields 30 fps, with screen rotated yields 12 fps. XO-1 build os802B1 yields 70 fps. In all tests, the first measurement made was smaller.
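Editorial sketch: the attachment itself is not reproduced in this ticket, so the following is only a minimal illustration of the measurement it describes (several batches of n frames, each batch timed separately so that a slow first batch stands out). The names `measure_batches`, `format_result` and the `flip` callback are hypothetical; the real test2.py presumably drives an actual full-screen redraw where `flip` is called here.

```python
import time


def measure_batches(flip, batches=5, n=80, clock=time.perf_counter):
    """Time `batches` runs of `n` calls to `flip`.

    `flip` is a placeholder for whatever performs one full-screen fill;
    it is an assumption of this sketch, not code from the attachment.
    Returns a list of (elapsed_seconds, frames_per_second) tuples.
    """
    results = []
    for _ in range(batches):
        start = clock()
        for _ in range(n):
            flip()
        elapsed = clock() - start
        # Guard against a zero interval when flip() is effectively free.
        fps = n / elapsed if elapsed > 0 else float("inf")
        results.append((elapsed, fps))
    return results


def format_result(index, n, elapsed, fps):
    """Render one batch in the same shape as the output quoted in this ticket."""
    return "Test #%d, n=%d, elapsed=%.2f, fps=%.2f" % (index, n, elapsed, fps)
```

Timing each batch separately, rather than averaging over the whole run, is what makes a warm-up effect visible: a single average would hide a slow first batch.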

comment:13 Changed 5 years ago by Quozl

  • Action Needed changed from test in build to code

While the improvements are good, the performance is not yet even on par with XO-1, so I'm pushing this out of test-in-build.

comment:14 Changed 5 years ago by mikus

XO-1, py-115 build:

0 [~]$ python test2.py
Test #0, n=80, elapsed=2.05, fps=39.00
Test #1, n=80, elapsed=1.56, fps=51.40
Test #2, n=80, elapsed=1.07, fps=74.55
Test #3, n=80, elapsed=1.06, fps=75.35
Test #4, n=80, elapsed=1.07, fps=74.44
0 [~]$

XO-1.5, os116 build:

0 [~]$ python test2.py
ALSA lib pulse.c:229:(pulse_connect) PulseAudio: Unable to connect: Connection refused

Test #0, n=80, elapsed=1.40, fps=57.22
Test #1, n=80, elapsed=1.29, fps=62.08
Test #2, n=80, elapsed=1.29, fps=61.95
Test #3, n=80, elapsed=1.29, fps=61.87
Test #4, n=80, elapsed=1.30, fps=61.76
0 [~]$

As I suggested in my earlier comment -- the XO-1.5 fps did not increase much from the first run, whereas the XO-1 fps increases to a noticeably higher number than on the first run.
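Editorial sketch: one way to put a number on the warm-up effect described above is to parse the fps column and compare the first batch with the mean of the later ones. The helper names and the choice to treat the first two batches as warm-up are assumptions made for this illustration; the input strings are the two runs quoted in this comment.

```python
import re

# Matches lines such as "Test #0, n=80, elapsed=2.05, fps=39.00".
LINE_RE = re.compile(r"Test #(\d+), n=(\d+), elapsed=([\d.]+), fps=([\d.]+)")

XO1_OUTPUT = """\
Test #0, n=80, elapsed=2.05, fps=39.00
Test #1, n=80, elapsed=1.56, fps=51.40
Test #2, n=80, elapsed=1.07, fps=74.55
Test #3, n=80, elapsed=1.06, fps=75.35
Test #4, n=80, elapsed=1.07, fps=74.44
"""

XO15_OUTPUT = """\
ALSA lib pulse.c:229:(pulse_connect) PulseAudio: Unable to connect: Connection refused

Test #0, n=80, elapsed=1.40, fps=57.22
Test #1, n=80, elapsed=1.29, fps=62.08
Test #2, n=80, elapsed=1.29, fps=61.95
Test #3, n=80, elapsed=1.29, fps=61.87
Test #4, n=80, elapsed=1.30, fps=61.76
"""


def parse_fps(output):
    """Return the fps values in order, ignoring unrelated lines (e.g. ALSA noise)."""
    return [float(m.group(4)) for m in LINE_RE.finditer(output)]


def warmup_summary(fps_values, warmup=2):
    """Return (first_fps, steady_mean, steady_mean / first_fps).

    `warmup` is the number of leading batches excluded from the steady-state
    mean; the default of 2 is an arbitrary choice for this sketch.
    """
    steady = fps_values[warmup:]
    steady_mean = sum(steady) / len(steady)
    return fps_values[0], steady_mean, steady_mean / fps_values[0]


for name, text in (("XO-1", XO1_OUTPUT), ("XO-1.5", XO15_OUTPUT)):
    first, steady, ratio = warmup_summary(parse_fps(text))
    print("%s: first=%.2f steady=%.2f ratio=%.2f" % (name, first, steady, ratio))
```

With these inputs the steady-state rate on the XO-1 is about 1.9 times its first batch, while on the XO-1.5 it is only about 1.08 times, which is the asymmetry noted above.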

Also, the XO-1.5 outputs a pulse audio message, whereas the XO-1 doesn't.

comment:15 Changed 5 years ago by jon.nettleton

Just wanted to note that this comparison isn't really valid: the XO-1 runs at a color depth of 16bpp and the XO-1.5 at 24bpp. Considering the increase in bit depth and the relatively small loss in performance (I say small because the number of full-screen fills per second is still greater than the vrefresh), I don't think this is a performance degradation any longer. If you would like a performance comparison, here are the numbers from test2.py run under 16bpp:

Test #0, n=80, elapsed=0.72, fps=110.59
Test #1, n=80, elapsed=0.70, fps=113.72
Test #2, n=80, elapsed=0.70, fps=113.97
Test #3, n=80, elapsed=0.70, fps=114.07
Test #4, n=80, elapsed=0.70, fps=113.77
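Editorial sketch: the bit-depth argument can be made concrete by converting fills per second into bytes written per second. This assumes a 1200x900 panel on both machines, 2 bytes per pixel at 16bpp, and either 3 or 4 bytes per pixel at depth 24 (X servers commonly pad depth 24 to 32 bits, but which applies here is not stated in this ticket). The function name is hypothetical; the fps figures are steady-state values quoted in this ticket, with the 16bpp figure taken to be from the XO-1.5.

```python
def fill_rate_mb_per_s(fps, width=1200, height=900, bytes_per_pixel=2):
    """Bytes written per second by full-screen fills, in MB/s (10**6 bytes).

    width/height default to an assumed 1200x900 panel; bytes_per_pixel is
    the assumed framebuffer storage size, not a value reported in the ticket.
    """
    return fps * width * height * bytes_per_pixel / 1e6


# (label, fps from this ticket, assumed bytes per pixel)
cases = [
    ("XO-1, 16bpp", 74.55, 2),
    ("XO-1.5, 24bpp stored as 3 bytes", 61.95, 3),
    ("XO-1.5, 24bpp stored as 4 bytes", 61.95, 4),
    ("XO-1.5, 16bpp", 113.97, 2),
]

for label, fps, bpp_bytes in cases:
    rate = fill_rate_mb_per_s(fps, bytes_per_pixel=bpp_bytes)
    print("%-34s %7.2f fps  %7.1f MB/s" % (label, fps, rate))
```

Under these assumptions the XO-1.5 writes more bytes per second than the XO-1 in every configuration listed, which is consistent with the point that the lower frame rate at 24bpp does not by itself indicate a regression.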

The PulseAudio problem could be various things. Make sure you don't have a custom asound.conf. Remove ~/.pulse* so ALSA doesn't try to connect. Check rpm -qa '*pulse*'; you should only have pulseaudio-libs-glib2 and pulseaudio-libs. There may be other places where ALSA links into PulseAudio. If you still have problems, open a separate ticket for that.

comment:16 Changed 5 years ago by Quozl

cjb, dsd, mikus, please comment on whether 60fps of full-screen fill is acceptable.

comment:17 Changed 4 years ago by Quozl

  • Resolution set to fixed
  • Status changed from new to closed

Retested using test2.py attachment.

  hardware  release version  build  fps  bpp
  XO-1      8.2.1            os802  68   16
  XO-1      10.1.2-rc        os851  72   16
  XO-1.5    10.1.2-rc        os851  62   24
Slightly slower XO-1.5 performance over XO-1, but considering the bits per pixel increase I'm happy with it. Closing ticket for lack of comment.
