Opened 6 years ago

Closed 6 years ago

#8098 closed defect (duplicate)

Mystery task hogs 50% of CPU.

Reported by: thomaswamm Owned by: morgs
Priority: normal Milestone: 8.2.0 (was Update.2)
Component: sugar Version: Development build as of this date
Keywords: Cc: garycmartin
Blocked By: Blocking:
Deployments affected: Action Needed: no action
Verified: no

Description

It takes more time to report a bug than to find one.

In joyride-2301, I was trying unsuccessfully to reproduce someone else's bug (maybe #8052), but gave up. Afterward I noticed sluggish response for no good reason. So in Terminal I ran
$ top
and saw a persistent python task hogging 50% of the CPU and 20% memory, for a long time, for no good reason. Even with no Sugar Activities running, 'top' in the ctrl-alt-F1 console still showed the mystery python task hogging CPU. Can't remember what I did to kill it; probably rebooted. Screenshot of 'top' attached.

Attachments (11)

tmpbQP7Gz.png (114.9 KB) - added by thomaswamm 6 years ago.
Screenshot - top shows mystery python task hogging CPU
tmpeXUGPn.png (134.4 KB) - added by thomaswamm 6 years ago.
Screenshot - top c shows python /usr/bin/sugar-shell hogging CPU
tmpvOs4mC.pdf (4.1 KB) - added by thomaswamm 6 years ago.
Trac thought this was spam. Output from pstree -al . Rename from .pdf to .log or .txt
tmpehxxkU.tar (200.0 KB) - added by thomaswamm 6 years ago.
/var/log/messages while the hog sugar-shell is running
tmpISo5pd.txt (488.9 KB) - added by thomaswamm 6 years ago.
/var/log/messages while the hog sugar-shell is running
tmpwUw6fF.tar (200.0 KB) - added by thomaswamm 6 years ago.
sugar-logs tarball of directory tree, I hope.
launch.patch (565 bytes) - added by marco 6 years ago.
tmpFMUB1L.png (120.4 KB) - added by thomaswamm 6 years ago.
Screenshot - top and strace sugar-shell hog
logs.CSN7500144C.2008-09-15.01-23-40.tar.bz2 (195.0 KB) - added by thomaswamm 6 years ago.
logs captured while sugar-shell was persistently hogging over 15% of CPU. Build 8.2-760.
strace.log (564.5 KB) - added by thomaswamm 6 years ago.
strace while sugar-shell was hogging over 15% of CPU.
strace-normal-sugar.log (8.3 KB) - added by thomaswamm 6 years ago.
A brief strace of sugar-shell when everything is normal (no CPU hog), for comparison.

Download all attachments as: .zip

Change History (38)

comment:1 Changed 6 years ago by tomeu

In top, you can get more meaningful names by pressing 'c'.

Changed 6 years ago by thomaswamm

Screenshot - top shows mystery python task hogging CPU

comment:2 Changed 6 years ago by garycmartin

  • Cc garycmartin added

My guess is that this was the sugar-shell (a python process) doing 'something'. Usually X and sugar-shell are the processes twitching about at the top of top with ~1% CPU when an XO is otherwise idle. Next time either press c in top, as suggested by tomeu, or take a note of the PID, then use ps -wfp <PID> to see the full command. Also pstree is v.good for looking at what process are started by what – if you want to see the full, un-truncated commands, use pstree -al

comment:3 Changed 6 years ago by thomaswamm

Thanks for the suggestions; I'm learning the tools.

Anyway it happened again, after a couple hours of only using Browse in joyride-2301 to surf Trac tickets. I can still use my XO, but it is noticeably sluggish.

top c says the hog is python /usr/bin/sugar-shell (Screenshot to be attached)

pstree -al produced a Greek manuscript, to be attached soon.

What is it doing, why, what started it, why doesn't it stop, and is this a bug? Could it be a virus or other invasion? The wifi LED seems unusually busy for no good reason; I'm just typing into Trac. After I submit this, I will experiment to kill the beast.

Changed 6 years ago by thomaswamm

Screenshot - top c shows python /usr/bin/sugar-shell hogging CPU

Changed 6 years ago by thomaswamm

Trac thought this was spam. Output from pstree -al . Rename from .pdf to .log or .txt

comment:4 follow-up: Changed 6 years ago by marco

Are you connected to a jabber server when it happens?

Debug logs would be very useful here.

http://wiki.laptop.org/go/Attaching_Sugar_Logs_to_Tickets

Cerebro seem to be running!?

comment:5 in reply to: ↑ 4 Changed 6 years ago by thomaswamm

Replying to marco:

Are you connected to a jabber server when it happens?

Probably not, but I don't know enough to be sure. Control Panel/Radio is set up for xochat.org which has not worked for me for weeks. There are no XO's in my Neighbourhood.

Debug logs would be very useful here.

If I knew how, maybe I could tarball the logs directory and attach the whole thing. Will try.
The beast still lives in my XO, stealing my CPU cycles.

comment:6 Changed 6 years ago by marco

You could start by uploading this one if it makes it easier:

/home/olpc/.sugar/default/logs/shell.log

Changed 6 years ago by thomaswamm

/var/log/messages while the hog sugar-shell is running

Changed 6 years ago by thomaswamm

/var/log/messages while the hog sugar-shell is running

Changed 6 years ago by thomaswamm

sugar-logs tarball of directory tree, I hope.

comment:7 Changed 6 years ago by thomaswamm

oops, attachment tmpehxxkU.tar is NOT /var/log/messages, but is probably sugar logs tarball (redundant copy)

In process of prepping to upload the logs, my XO keyboard went dead (but touchpad and mouse keys okay), so I had to Shutdown/reboot. Now the hog is gone or idle. XO is no longer sluggish. Top shows another process labelled python /usr/bin/sugar-shell using little CPU but 17% memory.

Hope you guys can use this info. The hog is a nuisance.

comment:8 follow-up: Changed 6 years ago by marco

Does this happen only after you started at least one activity? Does rebooting always solve it?

My only suspect is that the launcher icon might keep spinning. I cannot reproduce it though.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 6 years ago by thomaswamm

Replying to marco:

Does this happen only after you started at least one activity? Does rebooting always solve it?

My only suspect is that the launcher icon might keep spinning. I cannot reproduce it though.

Yes, I have only observed the hog when using an XO activity, always Browse-95, plus sometimes also Terminal, Journal, Pippy, Measure, Logs, Firefox-4.xo. I avoid running more than 3 activities. Rebooting the entire XO always stops the problem. If I see the hog again, I shall try ctrl-alt-erase to reboot only sugar.

As an experiment, I cold-booted my XO (to Home view, with only Journal started in background), then I went away for 5 hours. No appearance of the 50% hog, though python /usr/bin/sugar-shell is listed in top, using under 1% CPU and 17% mem. When I kill <PID> that process, X-server and sugar die, then restart.

I have been running Browse a few hours now, with no hog. I cannot yet reproduce the hog bug on demand.

So I suppose sugar-shell is the sugar GUI. Maybe the hog correlates with particular sugar features such as clipboard items or keyboard cheat codes or keyboarding errors, or persistent pallettes. Spinning code? To be tested...

comment:10 in reply to: ↑ 9 ; follow-up: Changed 6 years ago by marco

Replying to thomaswamm:

Yes, I have only observed the hog when using an XO activity, always Browse-95, plus sometimes also Terminal, Journal, Pippy, Measure, Logs, Firefox-4.xo. I avoid running more than 3 activities.

Shooting in the dark but... can you try to launch Firefox (perhaps a few times) and see if it causes the cpu hog?

So I suppose sugar-shell is the sugar GUI. Maybe the hog correlates with particular sugar features such as clipboard items or keyboard cheat codes or keyboarding errors, or persistent pallettes. Spinning code? To be tested...

My main suspect is the spinning icon that you get when you start an activity. I'm going to attach a patch to disable that animation. If you could apply it and see if it fixes the problem it would be very useful. As root:

cd /usr/share/sugar/shell/view/
patch -p0 <path_to_the_patch

Changed 6 years ago by marco

comment:11 Changed 6 years ago by tomeu

Thomas, when that happens again, could you please attach to the process with strace and attach the resulting log to this ticket?

From terminal:

strace -p <pid of sugar-shell> &> /home/olpc/strace.log

Thanks!

comment:12 Changed 6 years ago by mstone

thomaswamm -- any news?

comment:13 Changed 6 years ago by thomaswamm

The hog bug has not happened again, though I occasionally get other freezes or crashes that make the hog look pleasant in comparison. I can't try patches or logging until it re-appears or until I can reproduce it on demand.

comment:14 in reply to: ↑ 10 Changed 6 years ago by thomaswamm

Replying to marco:

Shooting in the dark but... can you try to launch Firefox (perhaps a few times) and see if it causes the cpu hog?

I have tried Firefox occasionally. No 50% persistent hog. But more mysterious total freezes and crashes. Browse seems more reliable and less troublesome.

comment:15 Changed 6 years ago by thomaswamm

  • Component changed from not assigned to performance
  • Owner set to cjb
  • Version changed from not specified to Development build as of this date

No re-appearances of the 50% CPU hog. I only ever saw it twice, in joyride-2301. I have since been testing 8.2-757 and now 8.2-759, which still have other bugs in other Trac tickets.

comment:16 Changed 6 years ago by thomaswamm

I am testing build 8.2-760 now, and see some slight sluggishness, and top says sugar-shell is persistently hogging over 15% of the CPU (for maybe the past hour). I have been running Browse-96, sometimes Terminal-17, sometimes console, and previously Firefox-6. At one time there was a ghost clipboard object leftover in the left frame, but it went away after more copy/paste operations. Logs to follow. I ran strace for maybe a minute then did ctrl-c to stop it.

Changed 6 years ago by thomaswamm

Screenshot - top and strace sugar-shell hog

Changed 6 years ago by thomaswamm

logs captured while sugar-shell was persistently hogging over 15% of CPU. Build 8.2-760.

Changed 6 years ago by thomaswamm

strace while sugar-shell was hogging over 15% of CPU.

comment:17 Changed 6 years ago by thomaswamm

After ctrl-alt-erase (warm reboot of sugar), sugar-shell is taking less than 3% of CPU.

Perhaps after you wizards review the logs, you can make more suggestions about possible ways to reproduce the hog.

comment:18 Changed 6 years ago by marco

  • Action Needed changed from reproduce to diagnose
  • Component changed from performance to sugar
  • Owner changed from cjb to marco

We don't print timestamps for the traces so I can't say for sure. But it seem like we are repeatedly trying to position a palette and failing to do so. It should not be too hard to figure out what is broken in the logic there.

comment:19 follow-up: Changed 6 years ago by marco

It seem like you tried to add a bzip file to the clipboard btw. Was that before or after the hog started?

comment:20 in reply to: ↑ 19 Changed 6 years ago by thomaswamm

Replying to marco:

It seem like you tried to add a bzip file to the clipboard btw. Was that before or after the hog started?

That was long after the hog started. The bzip was the olpc-log tarball attached above. Sometimes I upload using Browse (going thru clipboard and Journal), othertimes with Firefox. This time I was preparing both methods, but ended up using Firefox.

Changed 6 years ago by thomaswamm

A brief strace of sugar-shell when everything is normal (no CPU hog), for comparison.

comment:21 Changed 6 years ago by marco

  • Owner changed from marco to morgs

comment:22 Changed 6 years ago by thomaswamm

  • Blocked By 8592 added

comment:23 Changed 6 years ago by thomaswamm

#8592 might be related, but the first screenshot attached here does not show simultaneous AP icon and Mesh icon in frame.

comment:24 Changed 6 years ago by thomaswamm

Very probably the bug is found in #8711, with an easy fix!

comment:25 Changed 6 years ago by cscott

  • Action Needed changed from diagnose to test in build

Please test in joyride-2493 or later, or stable build 766 or later.

comment:26 Changed 6 years ago by thomaswamm

  • Blocked By 8592 removed

(In #8592) #8098 is unrelated to this bug.

comment:27 Changed 6 years ago by thomaswamm

  • Action Needed changed from test in build to no action
  • Resolution set to duplicate
  • Status changed from new to closed

I am satisfied that the mystery is solved by #8711, so will close this ticket as a duplicate. I am testing 8.2-767 and see no mystery CPU hogs.

Note: See TracTickets for help on using tickets.