Opened 6 years ago

Closed 6 years ago

#7603 closed defect (fixed)

2.6.25 audio performance regression

Reported by: dsd Owned by: ApprovalForUpdate
Priority: high Milestone: 8.2.0 (was Update.2)
Component: kernel Version: not specified
Keywords: joyride-2181:- Cc: dsaxena, cjb, jg, mstone, veplaini, gregorio, kimquirk
Blocked By: Blocking:
Deployments affected: Action Needed: test in release
Verified: no

Description

In joyride, TamTam doesn't work very well. Even when you start TamTamMini and just start a drum beat without playing any instruments, you get quite regular crackles/blips. At the same time as these imperfections, console messages (from alsa-lib?) appear, stating that underruns have occurred.

Downgrading to stable kernel 2.6.22 works around the problem. Sound output is smooth and no underrun messages appear.

Attachments (3)

kernel_group_schedule.patch (753 bytes) - added by dsaxena 6 years ago.
Enable group scheduling in kernel
tamtam_sched_rr.patch (636 bytes) - added by dsaxena 6 years ago.
Call sys_stuff->setscheduler() to enable SCHED_RR policy for TamTam applications
test-midi.tar.gz (1.8 MB) - added by veplaini 6 years ago.
This is a test package for MIDI file playback using a csound soundfont engine, for running at the terminal


Change History (81)

comment:1 Changed 6 years ago by dsd

I copied the CS5535 driver from stable, and also the whole of include/sound, sound/core, sound/pci/ac97 and sound/seq. There are still a lot of blips and underruns. Looks like this is caused by something outside of sound :(

Renicing tamtam helps a bit, but even at priority -20 it does not perform as well as update1

comment:2 Changed 6 years ago by dsd

2.6.26 has the same problem

comment:3 Changed 6 years ago by tomeu

I remember that the tamtam guys did a big effort to get smooth sound. Something changed at the scheduler, perhaps?

comment:4 Changed 6 years ago by dsaxena

  • Cc cjb added

I've reproduced the problem, and I'm also seeing (well, hearing) the same issue when I install 2.6.22-20080623.1.olpc.28f4cb6e780db07, the latest stable kernel. I don't see any messages from alsa-lib.

From looking at the kernel output and top, it looks like the issue is mainly showing up when hald queries the battery status. Every time I hear the crackling/pops, I'm seeing the following:

[  476.152429] olpc-ec:  Task hald (1586) running cmd 0x15
[  476.158343] olpc-ec:  received 0x13
[  476.159743] olpc-ec:  Task hald (1586) running cmd 0x15
[  476.168343] olpc-ec:  received 0x13
[  476.178821] olpc-ec:  Task hald (1586) running cmd 0x16
[  476.188343] olpc-ec:  received 0x62
[  476.191384] olpc-ec:  Task hald (1586) running cmd 0x15
[  476.198343] olpc-ec:  received 0x13
[  476.202273] olpc-ec:  Task hald (1586) running cmd 0x10
[  476.208343] olpc-ec:  received 0x55
[  476.208343] olpc-ec:  received 0xc4
[  476.211444] olpc-ec:  Task hald (1586) running cmd 0x15
[  476.218343] olpc-ec:  received 0x13
[  476.222644] olpc-ec:  Task hald (1586) running cmd 0x11
[  476.228343] olpc-ec:  received 0x0
[  476.228343] olpc-ec:  received 0x2
[  476.231064] olpc-ec:  Task hald (1586) running cmd 0x15
[  476.238343] olpc-ec:  received 0x13

Killing hald, on both 2.6.22 and 2.6.25 makes much of the crackling go away. If I do a 'cat' on any file in /sys/devices/platform/olpc-battery.0/power_supply/olpc-battery that triggers an EC command over and over while hald is stopped (or do cat /sys/devices/platform/olpc-battery.0/power_supply/olpc-battery/*), I hear the crack/pop/skip.

Looking at the EC code, it's not completely surprising we're seeing this as the EC command path disables IRQs and then has various mdelay() calls which just spin during the delay period. This would explain why nice does not really make a difference as we're just not going to get rescheduled while IRQs are disabled.
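
To put a rough number on how long one hald battery poll keeps the EC busy, the kernel timestamps in the log above can be subtracted. What follows is a minimal sketch, not something that was run for this ticket: the function names `parse_timestamp` and `burst_span_seconds` are hypothetical, and it assumes the `[seconds.microseconds]` printk prefix seen in the log.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Extract the printk timestamp (seconds since boot) from a line such as
// "[  476.152429] olpc-ec:  received 0x13". Returns false if none is found.
bool parse_timestamp(const std::string &line, double *seconds)
{
    return std::sscanf(line.c_str(), "[%lf]", seconds) == 1;
}

// Time between the first and last timestamped line of a burst of EC traffic.
// Lines without a timestamp are skipped; returns 0.0 if fewer than two parse.
double burst_span_seconds(const std::vector<std::string> &lines)
{
    double first = 0.0, last = 0.0, t = 0.0;
    int seen = 0;
    for (const std::string &line : lines) {
        if (!parse_timestamp(line, &t))
            continue;
        if (seen == 0)
            first = t;
        last = t;
        ++seen;
    }
    return seen >= 2 ? last - first : 0.0;
}
```

For the burst shown above, the first and last timestamps are 476.152429 and 476.238343, so the span comes out at about 0.086 s. Individual command/response pairs are roughly 6 to 10 ms apart, and the log alone does not show how much of that time IRQs are actually disabled.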

What surprises me is:

1) I'm not seeing the same alsa messages as dsd. Am I doing something wrong (I've enabled level 9 in /proc/kernel/printk, and grepping for alsa or underrun in /var/log/* returns nothing)? Am I running into a different issue?

2) Given that I can reproduce in 2.6.22 and the EC code is basically identical, why did we not catch this in earlier releases?

I've also noticed that when I first start up TamTamMini, jffs2_garbage_collect_thread is running and I hear very garbled audio. Also, when hald is stopped, I think I still get problems once in a while, and looking at top, it seems to be when pdflush is running. I say "I think" b/c the drum beat samples are not very smooth to begin with, so I'm not really sure if I'm hearing something erroneous, if it is just the sample, or it's just my hearing not being that great. I'm going to set up another test case, playing either real music [no offense to TamTam developers :)] or a continuous tone, so it is much easier for me to detect erroneous output.

comment:5 follow-up: Changed 6 years ago by dsaxena

Was looking at the wrong place for the ALSA underrun messages. I ran the activity via "sugar-launch TamTamMini" in terminal and grabbed stdout & stderr to see the application log. There is definitely a correlation between hald running a battery status query and an ALSA underrun occurring on both .25 and .22 kernels. I still see underruns when hald is disabled, but a drastically smaller number of them and almost unnoticeable to the ear. With hald running, I'll see several back to back, whereas w/o I'll see maybe one every few minutes if that.

I definitely still hear some crackling w/o hald every so often, but they are not related to hald.

Also, I can run 'aplay <.wav file>' from the command line and not hear any issues on both .22 and .25 kernels even if I do:

while true
do
   cat /sys/devices/platform/olpc-battery.0/power_supply/olpc-battery/*
done

If I do the above while running TamTamMini, the audio is very crackly and there are a lot of underruns (again, with both .25 and .22 kernels).

In summary, I don't think this is completely a kernel issue but has to do with the way that TamTam, or csound, or the combination is feeding the samples to the underlying device.

I also don't really see much difference in behaviour between .22 and .25.

comment:6 in reply to: ↑ 5 ; follow-up: Changed 6 years ago by dsd

Replying to dsaxena:

Was looking at the wrong place for the ALSA underrun messages. I ran the activity via "sugar-launch TamTamMini" in terminal and grabbed stdout & stderr to see the application log. There is definitely a correlation between hald running a battery status query and an ALSA underrun occurring on both .25 and .22 kernels.

Agreed.

I definitely still hear some crackling w/o hald every so often, but they are not related to hald.

Yeah. At least for me, the non-hald crackles do not occur (or occur extremely rarely) on 2.6.22 though. On 2.6.25 they aren't too bad, but I am seeing more than one every few minutes.

Also, I can run 'aplay <.wav file>' from the command line and not hear any issues on both .22 and .25 kernels even if I do:

while true
do
   cat /sys/devices/platform/olpc-battery.0/power_supply/olpc-battery/*
done

Yeah, I made the same observation earlier.

If I do the above while running TamTamMini, the audio is very crackly and there are a lot of underruns (again, with both .25 and .22 kernels).

Same here.

In summary, I don't think this is completely a kernel issue but has to do with the way that TamTam, or csound, or the combination is feeding the samples to the underlying device.

The difference is that TamTam is multitracking; it is interleaved sound. ALSA function snd_pcm_write()

aplay is just a simple sound stream.

I guess a sensible place to start attacking this would be to find out why hald in update1 does not continually query the battery status. This is why we didn't have the more severe crackling before.

comment:7 follow-up: Changed 6 years ago by cjb

I guess a sensible place to start attacking this would be to find out why hald in update1 does not continually query the battery status. This is why we didn't have the more severe crackling before.

There's no need to query the battery status more than once; we get an interrupt (EC SCI) when it changes.

comment:8 in reply to: ↑ 7 Changed 6 years ago by dsaxena

Replying to cjb:

I guess a sensible place to start attacking this would be to find out why hald in update1 does not continually query the battery status. This is why we didn't have the more severe crackling before.

There's no need to query the battery status more than once; we get an interrupt (EC SCI) when it changes.

Hmm...but if we are plugged in and fully charged (basically how I always run), we shouldn't see a SOC interrupt and therefore never query the battery. I'll debug/look at code to see if hald is indeed only querying on an interrupt or a timer of some sort.

comment:9 in reply to: ↑ 6 Changed 6 years ago by dsaxena

Replying to dsd:

In summary, I don't think this is completely a kernel issue but has to do with the way that TamTam, or csound, or the combination is feeding the samples to the underlying device.

The difference is that TamTam is multitracking; it is interleaved sound. ALSA function snd_pcm_write()

aplay is just a simple sound stream.

Looking at an strace, neither TamTam nor aplay calls sys_write() to send frames to the driver; both instead do an ioctl(SNDRV_PCM_IOCTL_WRITEI_FRAMES). However, both this and sys_write() end up calling snd_pcm_lib_write() to do the actual write.

I guess a sensible place to start attacking this would be to find out why hald in update1 does not continually query the battery status. This is why we didn't have the more severe crackling before.

Do you want to look into this while I look into the kernel/tamtam interaction? Even with hald running, we should be able to send interleaved streams to the device.

comment:11 Changed 6 years ago by cjb

Sigh. Yes, looks like it. Well done on finding it.

comment:12 Changed 6 years ago by dsd

since there are no events about changes via ACPI or from udev/kernel

Is that not true for our setup because we have that SCI interrupt thing? Just wondering how this ever worked before... the sugar home screen on update1 managed to show proper battery level

comment:13 Changed 6 years ago by cjb

Correct, it's untrue for us. Our battery SCIs make it up to HAL on an interrupt basis, and Sugar listens for them there.

comment:14 Changed 6 years ago by dsaxena

Replying to dsd:

I think this is responsible:
http://gitweb.freedesktop.org/?p=hal/hal.git;a=commit;h=7430beeb6c6fd6c8e51c24df20fd53c526aed6e8

So it looks like there are a couple of options here:

  • Back out the above change so we can do a custom hald build for now. Not the best long-term solution, but it would give us something that works now...
  • Fix up the above in HAL so that if the battery device does have a uevent, we don't poke at it continuously.
  • The EC code should probably be rewritten so it does not hold a lock and disable IRQs for so long, but this is also a long-term project, not something to get done by next Wednesday's build.

I'm looking into the non-hald pops right now. Installing update.1 and trying to see what has changed in TamTam. One commit that stands out:

commit 35fdf3ba130c95f2528b38b54f6e4c9448540f4c
Author: James <olpc@localhost.localdomain>
Date:   Wed Feb 21 04:35:16 2007 -0500

    adding self-scheduler

diff --git a/Util/Clooper/aclient.cpp b/Util/Clooper/aclient.cpp
index f53655a..d5da736 100644
--- a/Util/Clooper/aclient.cpp
+++ b/Util/Clooper/aclient.cpp
@@ -5,6 +5,7 @@
 #include <time.h>
 #include <unistd.h>
 #include <sys/time.h>
+#include <sched.h>

 #include <vector>
 #include <map>
@@ -18,7 +19,7 @@


 unsigned int SAMPLE_RATE = 16000;
-snd_pcm_uframes_t PERIODS_PER_BUFFER = 4;
+snd_pcm_uframes_t PERIODS_PER_BUFFER = 2;
 snd_pcm_uframes_t PERIOD_SIZE = (1<<8);

 static int setparams (snd_pcm_t * phandle )
@@ -52,6 +53,23 @@ static int setswparams(snd_pcm_t *phandle)
     return 0;
 }

+static void setscheduler(void)
+{
+       struct sched_param sched_param;
+
+       if (sched_getparam(0, &sched_param) < 0) {
+               printf("Scheduler getparam failed...\n");
+               return;
+       }
+       sched_param.sched_priority = sched_get_priority_max(SCHED_RR);
+       if (!sched_setscheduler(0, SCHED_RR, &sched_param)) {
+               printf("Scheduler set to Round Robin with priority %i...\n", sched_param.
+               fflush(stdout);
+               return;
+       }
+       printf("!!!Scheduler set to Round Robin with priority %i FAILED!!!\n", sched_para
+}
+
 static double pytime(const struct timeval * tv)
 {
     return (double) tv->tv_sec + (double) tv->tv_usec / 1000000.0;
@@ -390,6 +408,8 @@ struct TamTamSound
             buf[i*2] = buf[i*2+1] = 0.5 * sin( i / (float)nframes * 10.0 * M_PI);
         }

+        setscheduler();
+
         while (PERF_STATUS == CONTINUE)
         {
             int err = 0;

comment:15 Changed 6 years ago by dsd

I have some familiarity with HAL; I am going to attempt to cook up a patch which allows us to say "OLPC's battery gives updates, no need to poll" which should be upstream-suitable.

comment:17 follow-up: Changed 6 years ago by dsd

On my XO with those HAL patches applied, I am still getting a lot of underruns on 2.6.25 but not on 2.6.22.

I quantified this by running tamtam and playing a drum beat (increasing beats per bar and randomness to about 75%) for approximately one minute, while logging stdout. Then I grep for how many underruns occurred. I repeated 3 times for each kernel.

On 2.6.25: 99, 83, 96 (average 93 underruns)
On 2.6.22: 17, 10, 11 (average 13 underruns)

No other changes were made to the system, and HAL was not polling the battery while this was happening.

10-15 underruns occur during standard loading of tamtam, before you are playing anything. Almost all of the reported underruns for 2.6.22 were from there, and not actually skips while audio was playing.

comment:18 in reply to: ↑ 17 Changed 6 years ago by dsaxena

Replying to dsd:

On my XO with those HAL patches applied, I am still getting a lot of underruns on 2.6.25 but not on 2.6.22.

I quantified this by running tamtam and playing a drum beat (increasing beats per bar and randomness to about 75%) for approximately one minute, while logging stdout. Then I grep for how many underruns occurred. I repeated 3 times for each kernel.

On 2.6.25: 99, 83, 96 (average 93 underruns)
On 2.6.22: 17, 10, 11 (average 13 underruns)

Can you run top in another terminal and see what else is happening?

comment:19 Changed 6 years ago by dsd

Nothing jumps out at me, nothing is really using much CPU, not even TamTam.

comment:20 Changed 6 years ago by dsaxena

Getting back to this after being sidetracked by some other things. In researching Clooper and the history of the project a bit more, I found #5645 and from the comment "Clooper solves the problem for the XO because it is fine tuned to the specific audio hardware of the XO. " it is likely there may be some magic code that is tuned not only for the HW but for the specific kernel.

Something I noticed is that commit 35fdf3ba130c95f2528b38b54f6e4c9448540f4c added a setscheduler() function to the aclient.cpp file, and this function was called in TamTamSound.thread_fn() before entering the main loop to set the sched policy to SCHED_RR. This function is now in the SystemStuff class defined in audio.cpp and we never call sys_stuff->setscheduler() from the main thread, so we're using the default scheduling policy; however, we're not impacted by this on 2.6.22, so there's still something else that needs to be tweaked.

comment:21 Changed 6 years ago by rsmith

I mentioned this to dsaxena yesterday but I'm adding it to the ticket.

I have new EC firmware that speeds up the EC command processing by a minimum of 3x and normally by about 10x. You can test it from

http://dev.laptop.org/~rsmith/q2089.rom

If this helps to solve the issue then I'll add it into the firmware and do an E13 release.

What's our audio frame size? If it's >= 512 bytes then a single frame of audio should be longer than most of the EC command processing duration.
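
For reference, the timing arithmetic behind that question can be sketched as follows. This is a minimal sketch, assuming the values visible in the aclient.cpp diff earlier in this ticket (SAMPLE_RATE = 16000, PERIOD_SIZE = 1<<8 frames, PERIODS_PER_BUFFER = 2, previously 4) still apply to the builds being tested, which the ticket does not confirm. The helper names are hypothetical, and the calculation is in frames rather than bytes because the sample format is not stated here.

```cpp
// Duration in milliseconds of `frames` audio frames at `rate_hz` frames/second.
double frames_to_ms(unsigned long frames, unsigned int rate_hz)
{
    return 1000.0 * static_cast<double>(frames) / rate_hz;
}

// Total playback time held by a ring buffer made of `periods` periods.
double buffer_ms(unsigned long period_frames, unsigned long periods,
                 unsigned int rate_hz)
{
    return frames_to_ms(period_frames * periods, rate_hz);
}
```

With those assumed values, one period is 256 / 16000 s = 16 ms and the whole buffer holds 32 ms of audio (64 ms with four periods), so any stall approaching that length would be expected to show up as an underrun.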

comment:22 Changed 6 years ago by cjb

  • Keywords blocks?:8.2.0 added
  • Priority changed from normal to high

comment:23 Changed 6 years ago by kimquirk

This should definitely be under investigation. If the EC fixes it, great. We need some real data on the regression to determine if it is blocking.

comment:24 follow-up: Changed 6 years ago by dsd

The new EC code seems to make a small improvement - jitters seem to be reduced while stuff is messing with the battery. It does not affect the main outstanding issue of non-EC related crackles, unsurprisingly.

I did make a new and interesting observation though: this bug is not as bad as we think. If I run TamTamMini from screen and then close the Terminal activity, underruns drop from 93 to an average of 8 per minute. It is still not as good as 2.6.22/update1, but under normal circumstances this bug is not as severe as I thought.

comment:25 in reply to: ↑ 24 Changed 6 years ago by dsaxena

Replying to dsd:

I did make a new and interesting observation though: this bug is not as bad as we think. If I run TamTamMini from screen and then close the Terminal activity, underruns drop from 93 to an average of 8 per minute. It is still not as good as 2.6.22/update1, but under normal circumstances this bug is not as severe as I thought.

This is more evidence to me that the issues are scheduling related.

comment:26 follow-up: Changed 6 years ago by gregorio

  • Cc jg added

Hi Guys,

Can we fix it?

If there is something else that won't get attention if this is worked on, let me know what it is and we can make a priority call.

I'm not inclined to promote this to blocker:8.2.0 yet but I would hate to lose any TamTam functionality. I know that teachers in Uruguay use this in class!

Thanks,

Greg S

comment:27 Changed 6 years ago by dsd

We don't know the cause of the problem or how to fix it, but we do have some ideas for next diagnosis steps. Deepak wrote yesterday that he's working on this as his primary kernel issue.

comment:28 in reply to: ↑ 26 Changed 6 years ago by dsaxena

Replying to gregorio:

Hi Guys,

Can we fix it?

As mentioned by dsd, we need to find the root cause, and I'm working on that. I've unfortunately been massively sidetracked by some personal non-work stuff this week :( but getting back to focus now.

comment:29 Changed 6 years ago by veplaini

I can confirm that this happens with Csound+its own ALSA module (/usr/lib/csound/plugins/librtalsa.so) too, so it is not CLooper-only.
I will keep testing the latest builds to see if any improvements
show up. Thanks for working on this.
If any of you like I can prepare a simple test package (CSD, etc...),
which you can run on the terminal and check performance. That might
be the simplest way of testing.

comment:30 Changed 6 years ago by dsd

That would totally rock and if it could give some numeric output, we could even continue to test it in future, and maybe even add it to the tinderbox tests (http://tinderbox.laptop.org/). Thanks!

comment:31 follow-ups: Changed 6 years ago by dsaxena

  • Action Needed changed from never set to diagnose

I did some Googling around of "ALSA underrun scheduler" and found https://bugs.launchpad.net/ubuntu/+source/pulseaudio/+bug/190754 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/188226. By enabling group scheduling and forcing all applications to be in one group by disabling per-UID scheduling groups, I can run TamTamMini for 10+ minutes with an average of < 1 underrun per second.

I also think that we should put the call to SCHED_RR back into Clooper. It looks to me like this got dropped at some point and my (rudimentary so far) testing shows further improvement with
this scheduling policy enabled (I've gone 10 minutes with zero underruns at one point)

Patches will follow.

Changed 6 years ago by dsaxena

Enable group scheduling in kernel

Changed 6 years ago by dsaxena

Call sys_stuff->setscheduler() to enable SCHED_RR policy for TamTam applications

Changed 6 years ago by veplaini

This is a test package for MIDI file playback using a csound soundfont engine, for running at the terminal

comment:32 Changed 6 years ago by dsaxena

I've uploaded a test kernel and modules to http://dev.laptop.org/~dsaxena/kernel-tamtam-test.tgz.

comment:33 in reply to: ↑ 31 ; follow-up: Changed 6 years ago by veplaini

Replying to dsaxena:

I did some Googling around of "ALSA underrun scheduler" and found https://bugs.launchpad.net/ubuntu/+source/pulseaudio/+bug/190754 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/188226. By enabling group scheduling and forcing all applications to be in one group by disabling per-UID scheduling groups, I can run TamTamMini for 10+ minutes with an average of < 1 underrun per second.

I also think that we should put the call to SCHED_RR back into Clooper. It looks to me like this got dropped at some point and my (rudimentary so far) testing shows further improvement with
this scheduling policy enabled (I've gone 10 minutes with zero underruns at one point)

Patches will follow.

This sounds like an excellent idea. There is a wealth of knowledge out there for realtime audio scheduler solutions which we could tap into. Should we patch csound (I am thinking here InOut/rtalsa.c) to include a SCHED_RR call too?
Thanks.

comment:34 follow-up: Changed 6 years ago by dsaxena

I tried running the test-midi package and got a segfault:

[olpc@xo-0C-FF-AA test-midi]$ csound gm.csd -F midifile -m msg-level
time resolution is 2.321 ns
0dBFS level = 32768.0
Csound version 5.08.91 beta (float samples) Jul 28 2008
libsndfile-1.0.17
Csound tidy up: Segmentation fault

comment:35 in reply to: ↑ 33 Changed 6 years ago by dsaxena

Replying to veplaini:

This sounds like an excellent idea. There is a wealth of knowledge out there for realtime audio scheduler solutions which we could tap into. Should we patch csound (I am thinking here InOut/rtalsa.c) to include a SCHED_RR call too?

I'm not sure about forcing the policy at build time in csound. I'm not really an audio person, but are there cases where this would not be desired? It also looks like it already accepts a --sched flag... how is this handled?

comment:36 in reply to: ↑ 34 ; follow-up: Changed 6 years ago by veplaini

Replying to dsaxena:

I tried running the test-midi package and got a segfault:

[olpc@xo-0C-FF-AA test-midi]$ csound gm.csd -F midifile -m msg-level
time resolution is 2.321 ns
0dBFS level = 32768.0
Csound version 5.08.91 beta (float samples) Jul 28 2008
libsndfile-1.0.17
Csound tidy up: Segmentation fault

Try

csound gm.csd -F rain.mid -m 6

comment:37 in reply to: ↑ 36 Changed 6 years ago by dsaxena

Replying to veplaini:

csound gm.csd -F rain.mid -m 6

Thanks.

/me finds paper bag.

comment:38 Changed 6 years ago by gregorio

  • Keywords blocks:8.2.0 added; blocks?:8.2.0 removed

comment:39 Changed 6 years ago by Andrew Burgess

Sorry to come into this so late but are you aware of the latencytop program? It's designed for finding exactly the type of kernel problem that you've found 'the hard way' (absolutely no offense). It sure pointed the finger at a badly written driver on my home system.

http://http://www.latencytop.org/

It works without patches in modern kernels though I seem to recall it needs certain config options.

HTH

comment:40 Changed 6 years ago by Andrew Burgess

dammit, that should be http://www.latencytop.org/
(and I even previewed!)

comment:41 follow-up: Changed 6 years ago by veplaini

With joyride-2273 not much improvement in the midi-tests; however
using the --sched option for csound really makes a difference: only
1 xrun in rain.mid where there were loads.

However --sched only works under su/sudo, as it can't lock memory
as user OLPC. This could be worked around so that the right permissions
are given to an 'audio' group, perhaps?

This might not be necessary if we have a more responsive kernel. I'm
really happy you are looking into this: previously I had the impression
RT preemption for olpc was a no-go area. I think this is crucial for
a laptop that will use media heavily.

Linux is a great system for audio (probably the best), when the right
kernel tuning is applied. AFAIK, the best kernel so far for audio is
2.6.24[.7-rt17]. There is also a list specialised in discussing these
issues: http://lists.linuxaudio.org/listinfo/linux-audio-tuning.

VL

comment:42 Changed 6 years ago by veplaini

However --sched is only used by the csound command; activities use
the API, so --sched is ignored. So I might look into allowing a
new option to set scheduler priority as part of the rtalsa module,
which should improve performance. This won't affect TamTam, as it
does not use rtalsa (but as you said here
there is scheduler priority code there).

We might still need the right permissions to use it.

comment:43 in reply to: ↑ 41 Changed 6 years ago by dsaxena

Replying to veplaini:

With joyride-2273 not much improvement in the midi-tests; however
using the --sched option for csound really makes a difference: only
1 xrun in rain.mid where there were loads.

However --sched only works under su/sudo, as it can't lock memory
as user OLPC. This could be worked around so that the right permissions
are given to an 'audio' group, perhaps?

--sched currently both sets the priority and locks memory. Is it possible
to make these into two separate ops, at least for testing purposes? (we
may want to take this discussion offline).

This might not be necessary if we have a more responsive kernel. I'm
really happy you are looking into this: previously I had the impression
RT preemption for olpc was a no-go area. I think this is crucial for
a laptop that will use media heavily.

Note that we're not going to even consider -rt for the 8.2 update at this
point as it would invalidate all our testing so far. Have you done your
csound testing with the kernel patch in comment 32?

Thanks for the list, I'll go join.

comment:44 Changed 6 years ago by dsaxena

Daniel, can you test with the kernel in comment 32? I'd like your OK before I commit this since you originally found this bug. Also, can you look at the TamTam patch and commit that?

comment:45 Changed 6 years ago by veplaini

I can't test kernel patches because I am not building kernels
here. But I will do as soon as it comes out in the joyride.

I will look into separating the two, or at least just setting the
priority with --sched as a test. But this code has to be moved
to the output module (librtalsa.so), because we hardly use the
command line. Most of our use of csound is through the API.

That is OK about 8.2, but if you can consider the possibility of RT patches
for the future, that'd be great.

comment:46 in reply to: ↑ 31 Changed 6 years ago by dsaxena

Replying to dsaxena:

I did some Googling around of "ALSA underrun scheduler" and found https://bugs.launchpad.net/ubuntu/+source/pulseaudio/+bug/190754 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/188226. By enabling group scheduling and forcing all applications to be in one group by disabling per-UID scheduling groups, I can run TamTamMini for 10+ minutes with an average of < 1 underrun per second.

I meant "< 1 underrun per minute", which is a large difference.

comment:47 follow-up: Changed 6 years ago by dsd

I patched tamtam, changed to the test kernel and ran my unscientific test. Underruns-per-minute reduced from 92 to 76. Then I detached the screen session and closed the terminal, and observed 16 underruns per minute (vs 8 from my previous measurements). 2.6.22 still seems to win hands-down on these tests.

However I then launched TamTamMini from sugar, started a drumbeat and played a continuous tone on an instrument for a minute or so. Didn't hear any imperfections. So perhaps we have reached the stages of being "good enough"

comment:48 in reply to: ↑ 47 Changed 6 years ago by dsaxena

Replying to dsd:

I patched tamtam, changed to the test kernel and ran my unscientific test. Underruns-per-minute reduced from 92 to 76. Then I detached the screen session and closed the terminal, and observed 16 underruns per minute (vs 8 from my previous measurements). 2.6.22 still seems to win hands-down on these tests.

However I then launched TamTamMini from sugar, started a drumbeat and played a continuous tone on an instrument for a minute or so. Didn't hear any imperfections. So perhaps we have reached the stages of being "good enough"

It comes down to what are the use cases we care about in deployed systems. Kids will launch TamTamMini directly from Sugar, not from the terminal and I'm guessing they won't be multitasking a lot and running multiple activities at once and have TamTamMini playing audio in the background while doing something else (I could be wrong).

I'll go ahead and commit the config update for 8.2.

I did try to run latencytop on the XO today but I think there's a version mismatch between what's in our kernel and the package in the Fedora repositories, so I need to custom build the old version from scratch and try some more.

comment:49 follow-up: Changed 6 years ago by dsaxena

Kernel config has been committed so we're now running with the group scheduler.

Daniel, can you commit the TamTam change?

comment:50 Changed 6 years ago by veplaini

I have been testing with the scheduler priority code in the alsa
module in Csound and it makes a good difference. However, I can only
test it in the terminal, because from a sugar activity the set_scheduler
function fails. I believe this is a permissions issue. I have set
the /etc/security/limits.conf and this works for user olpc at
the terminal, but activities are user '10002' and this does not
work for them.
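
The limits.conf entries mentioned above end up as per-process resource limits (the rtprio and memlock items correspond to RLIMIT_RTPRIO and RLIMIT_MEMLOCK), so one way to see what an activity running under a different uid actually gets is to read them back from inside the process. A minimal sketch with hypothetical helper names; it only inspects the limits and ignores capabilities such as CAP_SYS_NICE or CAP_IPC_LOCK, which would bypass them:

```cpp
#include <sys/resource.h>

// Read the soft limit for `resource` into *soft. Returns true on success.
bool soft_limit(int resource, rlim_t *soft)
{
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0)
        return false;
    *soft = rl.rlim_cur;
    return true;
}

// True if the current limits allow locking any memory at all.
bool may_lock_memory()
{
    rlim_t bytes = 0;
    return soft_limit(RLIMIT_MEMLOCK, &bytes) && bytes > 0;
}

// True if the limits allow requesting a realtime priority up to `priority`.
bool may_use_rt_priority(int priority)
{
    rlim_t ceiling = 0;
    return soft_limit(RLIMIT_RTPRIO, &ceiling) &&
           (ceiling == RLIM_INFINITY ||
            ceiling >= static_cast<rlim_t>(priority));
}
```

Calling `may_use_rt_priority(99)` from inside an activity and from the terminal would show whether the two contexts really see different limits, which is what the behaviour described above suggests.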

Daniel, can you confirm the set_scheduler function is definitely
working and setting the scheduler policy? I would expect it not
to work for the same reasons as above. You can check its return
value for errors.

Any further suggestions?

comment:51 in reply to: ↑ 49 Changed 6 years ago by veplaini

Replying to dsaxena:

Kernel config has been committed so we're now running with the group scheduler.

Daniel, can you commit the TamTam change?

What is the 'group scheduler' again? And is this already in the
joyride?

comment:52 follow-up: Changed 6 years ago by dsd

I can confirm that the setscheduler patch doesn't work:
!!!Scheduler set to Round Robin with priority 99 FAILED!!!
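
The FAILED message above does not say why the call was refused. A variant that reports errno would make that visible; for an unprivileged process with no realtime priority allowance the usual answer is EPERM. This is a minimal sketch, not the TamTam patch itself, and the function names are hypothetical:

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <sched.h>

// Ask the kernel to run the calling process under SCHED_RR at `priority`.
// Returns 0 on success, otherwise the errno left by sched_setscheduler().
int request_round_robin(int priority)
{
    struct sched_param param;
    std::memset(&param, 0, sizeof(param));
    param.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_RR, &param) == 0)
        return 0;
    return errno;
}

// Same request, but report why it failed instead of only that it failed.
int request_round_robin_verbose(int priority)
{
    int err = request_round_robin(priority);
    if (err == 0)
        std::printf("SCHED_RR set, priority %d\n", priority);
    else
        std::fprintf(stderr, "SCHED_RR priority %d refused: %s\n",
                     priority, std::strerror(err));
    return err;
}
```

Passing `sched_get_priority_min(SCHED_RR)` rather than the maximum is a gentler choice for testing, since a runaway thread at the top realtime priority can starve the rest of the system.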

comment:53 in reply to: ↑ 52 ; follow-up: Changed 6 years ago by dsaxena

  • Cc mstone added

Replying to dsd:

I can confirm that the setscheduler patch doesn't work:
!!!Scheduler set to Round Robin with priority 99 FAILED!!!

Hmm, so my better behavior with the scheduler priority change was just coincidental. We are still seeing better behaviour with the different kernel scheduler, but it'd be nice if we can use the RR scheduling.

Could we have csound run setuid user and see if just having it run RR helps? Would this break our security model?

comment:54 in reply to: ↑ 53 Changed 6 years ago by dsaxena

Replying to dsaxena:

Could we have csound run setuid user and see if just having it run RR helps? Would this break our security model?

meant setuid olpc.

comment:55 follow-up: Changed 6 years ago by mstone

Just tell me what syscalls you actually want it to be able to call! :)

comment:56 in reply to: ↑ 55 Changed 6 years ago by dsaxena

Replying to mstone:

Just tell me what syscalls you actually want it to be able to call! :)

sys_sched_setscheduler()

comment:57 follow-up: Changed 6 years ago by veplaini

Have you read my posts to the devel list? I have seen significant improvement
with the scheduler code. I have a patch already committed to CVS to do this
in the rtalsa module, but because of the recent problems with Koji I was not
able to build it. I will be able to do it on Monday evening, then you will be
able to test csound from the terminal (by setting /etc/security/limits.conf
correctly for user olpc).

It's possible to test this code + memory lock by using the --sched option
in the command line csound (as root, or as olpc if you edit limits.conf).
The code in my patch does not lock memory.
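
To check whether a given user is allowed to do this before testing, a small sketch along these lines may help. It assumes a Linux system where the `rtprio` and `memlock` items in limits.conf map to RLIMIT_RTPRIO and RLIMIT_MEMLOCK; the function names are illustrative only and are not part of csound:

```python
import resource


def describe_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def realtime_limits():
    """Return the soft/hard limits that gate realtime priority and memory locking."""
    limits = {}
    for name in ("RLIMIT_RTPRIO", "RLIMIT_MEMLOCK"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (describe_limit(soft), describe_limit(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in realtime_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

A soft RLIMIT_RTPRIO of 0 means an unprivileged process cannot request a realtime priority, so the limits.conf edit would need to raise it for user olpc.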

comment:58 Changed 6 years ago by veplaini

  • Cc veplaini added

comment:59 in reply to: ↑ 57 ; follow-up: Changed 6 years ago by dsaxena

Replying to veplaini:

Have you read my posts to the devel list? I have seen significant improvement
with the scheduler code. I have a patch already committed to CVS to do this
in the rtalsa module, but because of the recent problems with Koji I was not
able to build it. I will be able to do it on Monday evening, then you will be
able to test csound from the terminal (by setting /etc/security/limits.conf
correctly for user olpc).

It's possible to test this code + memory lock by using the --sched option
in the command line csound (as root, or as olpc if you edit limits.conf).
The code in my patch does not lock memory.

Yes I saw this. If calling "csound --sched" from Clooper helps, then we just need to work with mstone to figure out the appropriate way to enable the scheduler call w/o compromising our security model.

comment:60 in reply to: ↑ 59 Changed 6 years ago by veplaini

Replying to dsaxena:

Replying to veplaini:

Have you read my posts to the devel list? I have seen significant improvement
with the scheduler code. I have a patch already committed to CVS to do this
in the rtalsa module, but because of the recent problems with Koji I was not
able to build it. I will be able to do it on Monday evening, then you will be
able to test csound from the terminal (by setting /etc/security/limits.conf
correctly for user olpc).

It's possible to test this code + memory lock by using the --sched option
in the command line csound (as root, or as olpc if you edit limits.conf).
The code in my patch does not lock memory.

Yes I saw this. If calling "csound --sched" from Clooper helps, then we just need to work with mstone to figure out the appropriate way to enable the scheduler call w/o compromising our security model.

No, this won't work for CLooper. The --sched code is in the csound CLI frontend and not in the Csound library. The solution is to reinstall the scheduler code in CLooper, if we are allowed to.
The code I have added to Csound will not affect CLooper, because it is in the librtalsa.so plugin, which is not loaded by CLooper. However it should help all other activities using Csound (TamTam is not the only one).
We'll see what mstone has to say and hopefully we can test this next week.

comment:61 Changed 6 years ago by veplaini

I remembered this paper by Lee Revell at lac2006 which has
some relevant information re: future improvements to the
kernel for audio. It might be a useful read. Lee is a nice
guy and we can ask him questions directly as well if we need to:

http://lac.zkm.de/2006/papers/lac2006_lee_revell.pdf

comment:62 follow-up: Changed 6 years ago by cscott

Could we also verify that the root cause is not sugar-performance-related? There are a number of "Sugar takes up 100% cpu" type bugs floating around, and I'd like to be certain that this is not a dup of those.

What's our current status with unmodified joyride and unmodified tam tam? How bad are the underruns now?

comment:63 Changed 6 years ago by dsd

Downgrading to 2.6.22 improves performance in my test (although as noted above, my test is not of usual circumstances), so I am not quick to blame sugar vs kernel.

joyride currently is quite good here, maybe not as good as update1, but good enough to go forward. The scheduler fix may help more, once rainbow lets us do that. Also, this bug should not be closed until the hal fix has made it somewhere proper; right now it's in my public_rpms. I contacted hal upstream and got no real response, and I filed a bug on Red Hat Bugzilla and got no response either. I'll give it another couple of days and then request an OLPC-3 fork from the cvs admins.

comment:64 Changed 6 years ago by cjb

Could we also verify that the root cause is not sugar-performance-related?

I was worried about it being general-performance-related, so I compared tinderbox runs of pystone between update.1 and joyride builds. Looks like they're identical, so that's something.

comment:65 in reply to: ↑ 62 ; follow-up: Changed 6 years ago by veplaini

Replying to cscott:

Could we also verify that the root cause is not sugar-performance-related? There are a number of "Sugar takes up 100% cpu" type bugs floating around, and I'd like to be certain that this is not a dup of those.

I am quite positive this is a kernel issue, unrelated to sugar. Graphics will play
a part in generating xruns, but only because the kernel cannot manage preemption
properly, IMHO.

comment:66 in reply to: ↑ 65 Changed 6 years ago by veplaini

Replying to veplaini:

Replying to cscott:

Could we also verify that the root cause is not sugar-performance-related? There are a number of "Sugar takes up 100% cpu" type bugs floating around, and I'd like to be certain that this is not a dup of those.

I am quite positive this is a kernel issue, unrelated to sugar. Graphics will play
a part in generating xruns, but only because the kernel cannot manage preemption
properly, IMHO.

Also, it seems to be a well-known fact in audio developers' circles that there was a regression between 2.6.24 and 2.6.25 in terms of RT performance.

comment:67 Changed 6 years ago by cscott

dsd: please start the process of forking hal upstream; we need to start making our stable branch, and I'd prefer to carry as much as possible in koji.

Since audio performance is now acceptable, the rest will likely be deferred to 9.1, especially if there's a chance we'll get some fixes from the upstream kernel when we move to 2.6.27 (veplaini: have the regressions been fixed in newer kernels yet?).

comment:69 Changed 6 years ago by veplaini

I just want to point out that performance is still not up to the standard of the previous release. My tests with a MIDI activity show that whilst I got no xruns with the previous release, with 2301 I am getting about 2-3 a minute. By increasing the
buffer size I can remove these, at the expense of interactivity (not a problem
here, but possibly elsewhere). This is of course much better than before the
changes to the kernel.
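
To put the buffer size trade-off in numbers: the time to play out one buffer is its length in frames divided by the sample rate, so doubling the buffer doubles the delay before a key press can be heard. A rough sketch follows; the sample rate and buffer sizes are illustrative assumptions, not the values used in my tests:

```python
def buffer_latency_ms(frames, sample_rate_hz):
    """Time to play one buffer of `frames` frames, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


if __name__ == "__main__":
    # Illustrative values only; neither the rate nor the sizes come from the ticket.
    rate = 16000
    for frames in (256, 512, 1024, 2048):
        print(f"{frames:5d} frames -> {buffer_latency_ms(frames, rate):6.1f} ms")
```

A larger buffer gives the scheduler more slack before an xrun, which is why it hides the problem for playback but hurts anything played live.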

If there is no time to test changes to Rainbow to allow scheduler priority changing,
there is nothing more we can do. We should then defer further work to the next release, when we might try more aggressive preemption in the kernel with RT patches.

comment:70 follow-up: Changed 6 years ago by dsd

If we fix rainbow, does that definitely increase performance up to previous standards? Unless I'm missing something, setting SCHED_RR is just an idea for improvement, not something that we have actually measured?

comment:71 in reply to: ↑ 70 Changed 6 years ago by veplaini

Replying to dsd:

If we fix rainbow, does that definitely increase performance up to previous standards? Unless I'm missing something, setting SCHED_RR is just an idea for improvement, not something that we have actually measured?

Well, I have tested it at the terminal and it definitely makes a difference. I expect the same to be true when we can do it in the activities (but we need to
test).

comment:72 Changed 6 years ago by veplaini

From what I can see, what this code helps with is giving audio a higher
priority than the graphics. What I noticed was that a lot of the time the
xruns would occur on pointer movement or something similar. It does not help
much if you are comparing performance where there is no interaction.

comment:73 Changed 6 years ago by kimquirk

  • Keywords 8.2.0:? blocks:8.2.0 removed

In the current builds this is no longer a blocker, though there is more work that can be done. It also needs appropriate packaging.

comment:74 Changed 6 years ago by dsd

  • Owner changed from dilinger to ApprovalForUpdate

Please tag hal-0.5.11-2.olpc3.1 and hal-info-20080607-1.olpc3.1 for 8.2 (dist-olpc3). These fixed packages are already included in the build but are coming from my public_rpms. I have only now managed to get this package forked upstream.

I will file a new bug to track the setscheduler vs rainbow issue for hopeful inclusion in 8.2.1.

comment:75 Changed 6 years ago by dsd

|TestCase|

Fully charge XO battery. Leave running on AC power. Watch kernel logs, confirm that kernel is not spitting out EC messages every 30 seconds.

comment:76 Changed 6 years ago by mstone

  • Action Needed changed from diagnose to add to build

comment:77 Changed 6 years ago by cscott

  • Action Needed changed from add to build to test in release

hal packages added to 8.2; should be in 758 and following.

comment:78 Changed 6 years ago by joe

  • Cc gregorio kimquirk added
  • Resolution set to fixed
  • Status changed from new to closed

Tested both in 8.2-759 and 8.2-760 - couldn't find any imperfections with my (untrained) ears. Closing this ticket.
