Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#9559 closed defect (fixed)

os33 - Journal misbehaving

Reported by: mikus Owned by: marco
Priority: normal Milestone:
Component: sugar Version: Development build as of this date
Keywords: Cc: mikus, sayamindu
Blocked By: Blocking:
Deployments affected: Action Needed: review
Verified: no

Description

Strange entries in Journal View. Could not get "detail view" to be shown for them. Could not delete them.

When I deleted some other entries in the Journal, these strange entries were removed one by one. But it seemed to me that I still could *not* delete those Journal entries which had moved into the positions formerly occupied by the strange entries.

Attachments (4)

Journal.png (74.1 KB) - added by mikus 5 years ago.
strange.zip (150.1 KB) - added by mikus 5 years ago.
os42 - collection of potentiallly relevant files
shell.log (9.0 KB) - added by dsd 5 years ago.
shell log
datastore.tar.bz2 (522.4 KB) - added by dsd 5 years ago.
corrupted datastore

Download all attachments as: .zip

Change History (30)

Changed 5 years ago by mikus

comment:1 Changed 5 years ago by dsd

  • Component changed from not assigned to sugar
  • Owner set to marco

comment:2 follow-up: Changed 5 years ago by sayamindu

What was the language set to when this happened? This bit of the code uses plural forms in gettext which is sometimes incorrectly translated.

comment:3 in reply to: ↑ 2 Changed 5 years ago by mikus

Replying to sayamindu:

What was the language set to when this happened?

I never changed the language. It was whatever os33 had specified it to be.

comment:4 follow-up: Changed 5 years ago by Quozl

  • Action Needed changed from never set to reproduce

Triage.

Was this behaviour dependent on any external media?

comment:5 Changed 5 years ago by dsd

  • Cc sayamindu added

comment:6 in reply to: ↑ 4 Changed 5 years ago by mikus

Replying to Quozl:

Was this behaviour dependent on any external media?

This was seven build versions ago - I no longer remember what took place before I saw the strange screen. [I do not keep a notebook with detail annotations. Basically, I go ahead and use the machine - and if something unusual happens, it becomes a coin toss as to whether to mention that happening, or not.]

It is not clear to me what your question asks. My normal machine setup includes a "permanent" SD card in the OLPC external slot, plus an USB hub plugged in, which in turn has an USB keyboard, an USB trackball, and an USB ethernet adapter plugged in there. But that is the machine setup I use *every* time I use the machine (sometimes I have to unplug some USB stuff in order to overcome OFW recalcitrance at performing a boot; but I plug things back in once the kernel gets loaded) -- so my "external media" remain constant, whether or not the machine is behaving normally, or happens to do something unexpected.

comment:7 follow-up: Changed 5 years ago by Quozl

  • Action Needed changed from reproduce to diagnose

Your screen shot shows external media was mounted, and so the contents of the journal on that media would be critical to the problem. It was input data to the journal reading code.

I've not seen this corruption in the absence of your media here, so I think it is caused by the contents of your media in combination with the journal reading code.

We appear to have lost the opportunity to diagnose this problem further because it was seven build versions ago and you don't remember.

It may relate to #9658 "New journal cannot read old stored Journal Entries", in which case you might be able to verify this by using the media with build 802, saving a journal entry, and then trying to use it with a later build.

If you have a backup of your media at the time the problem happened, you could restore from it and reproduce.

I don't think it is practical to reproduce this given the available data, so I'll move this on to diagnose and hope the Sugar experts can help fix it.

comment:8 in reply to: ↑ 7 ; follow-up: Changed 5 years ago by mikus

Replying to Quozl:

Your screen shot shows external media was mounted, and so the contents of the journal on that media would be critical to the problem. It was input data to the journal reading code.

I doubt whether me having external media mounted was the cause of the unexpected screen content. [I believe that in Sugar 0.84, information is not kept about the content of external media - rather, that media is scanned anew each time it is accessed.]

I have had a "permanent" SD card in my OLPC from the first day I got it (in 2007). As I got more XOs, each of them got its own "permanent" SD card. If the presence of these external media were the causative factor, I would have noticed more such incidents of Journal misbehavior -- but I have not.

In particular, the only times I have seen the presence of the external media affect the content of Journal View is if I explicitly click on the external media icon at the bottom of Journal (instead of using Journal View to show the content of the "internal" Journal). To the best of my recollection, I had not done so when I encountered the #9559 misbehavior.

It may relate to #9658 "New journal cannot read old stored Journal Entries", in which case you might be able to verify this by using the media with build 802, saving a journal entry, and then trying to use it with a later build.

Not the situation here. The "internal" Journal is "reset" from scratch every time I install a new build. [Although my build 802 machines also have a "permanent" SD card, I never swap cards between machines.]


What you've said leads me to reconsider whether those 'olpc_episode4.ogg' entries in the Journal are a record of me on the day before having accessed content from the "permanent" SD card. It's possible -- it was cumbersome in Sugar 0.82 to access external media content, and I might have been checking out how such access worked in Sugar 0.84. I don't remember.

comment:9 in reply to: ↑ 8 Changed 5 years ago by Quozl

Replying to mikus:

I doubt whether me having external media mounted was the cause of the unexpected screen content. [I believe that in Sugar 0.84, information is not kept about the content of external media - rather, that media is scanned anew each time it is accessed.]

No worries. You probably know more about it than I do. I'll leave it to the Sugar experts.

What you've said leads me to reconsider whether those 'olpc_episode4.ogg' entries in the Journal are a record of me on the day before having accessed content from the "permanent" SD card.

Yes, it seems strange that these would appear given that you had reloaded the operating system and not directed it to read from the SD card. We don't ship that file name on the current builds.

comment:10 Changed 5 years ago by mikus

Got the strange entries in Journal again, with os42.

The preceding events:
1) The XO-1.5 locked up completely while I was using a webkit-based browser. [This might/might not have left Journal in a strange state.] Had to push power button to reboot the system. [I do not know what caused that lock-up.]
2) When Home View came up, I did not like it. [Activities on mySD card were not being shown.] I entered <ctl-alt-erase> to restart sugar.
3) When Home view came up again (this time with the Activities that were on the SD card), I clicked on the Journal icon (below the "me-icon" in the screen center) - I intended to launch Terminal from Journal View.
4) But in Journal View I noticed two strange entries. I pressed <alt-1> to take a snapshot - but I think none actually was saved. [I do not know if it was me pressing the keyboard that failed, or if it was the software that was supposed to respond to <alt-1> that failed.] I then went to a text console (<ctl-alt-2>) and captured some files from /var/log and ~/.sugar/logs.
5) I came back to the sugar session - by now there was only one strange entry shown, but the Speak entry [that was in Journal from before the system lock-up (step (1))] now showed twice. Again I pressed <alt-1>, and this time a snapshot was taken.
6) I captured the .sugar/database tree. [Note that this was *after* the capture of the screen snapshot into Journal.]

Changed 5 years ago by mikus

os42 - collection of potentiallly relevant files

comment:11 Changed 5 years ago by sayamindu

The shell log seems to suggest a corrupt datastore

  File "/usr/lib/python2.6/site-packages/jarabe/journal/journalactivity.py", line 265, in _focus_in_event_cb
    self._list_view.update_dates()
  File "/usr/lib/python2.6/site-packages/jarabe/journal/listview.py", line 462, in update_dates
    entry.update_date()
  File "/usr/lib/python2.6/site-packages/jarabe/journal/collapsedentry.py", line 276, in update_date
    self.date.props.text = misc.get_date(self._metadata)
  File "/usr/lib/python2.6/site-packages/jarabe/journal/misc.py", line 87, in get_date
    timestamp = float(metadata['timestamp'])
ValueError: invalid literal for float(): text/plain

I'll try to reproduce this by messing up the datastore manually and checking what happens.

comment:12 Changed 5 years ago by sayamindu

Can reproduce by manually changing metadata/timestamp in one of the entries to text instead of an int.

Trying to figure out now if this can be handled gracefully.

comment:13 follow-up: Changed 5 years ago by sayamindu

In mikus's attached datastore, ./27/27d07415-a40f-4422-9fac-64bb863daab6/metadata/timestamp does not contain a number (timestamp), but the string 'text/plain'

comment:14 in reply to: ↑ 13 Changed 5 years ago by Quozl

Replying to sayamindu:

In mikus's attached datastore, ./27/27d07415-a40f-4422-9fac-64bb863daab6/metadata/timestamp does not contain a number (timestamp), but the string 'text/plain'

That is quite consistent with his report of powering off the unit, in light of #9612. Sugar must assume file corruption may occur, and recover gracefully. #9455 may reduce the probability of this happening again, but really, Sugar should be resilient to file corruption, because #9455 isn't the only source of potential corruption.

comment:15 Changed 5 years ago by sayamindu

Hmm, this is a tricky issue. While the knee-jerk reaction would be to simply not display the journal entries that are damaged, some of the entries might actually still work (for example, clicking on the "blank" entry in my synthetically damaged journal did start the activity correctly). It would be almost impossible to figure out the level of "damage" to the underlying datastore/filesystem from Sugar level, so as a result, it would be impossible to figure out which damaged entries to show and which to hide.
I'll try to start a discussion on upstream sugar list about this.

comment:16 Changed 5 years ago by sayamindu

  • Action Needed changed from diagnose to add to build

I have tried to make the timestamp getting part, as well as the 'keep' metadata flag more robust, as they were the two fields that seemed to get affected.
I have uploaded a test RPM to public_rpms with a patch to do this. Please test with the next build if you can reproduce this. I have suggested a slightly more invasive patch upstream at http://dev.sugarlabs.org/ticket/1561

comment:17 Changed 5 years ago by sayamindu

  • Action Needed changed from add to build to test in build

Please test os43 to see if this occurs again. If it occurs, please attach datastore.zip and shell.log

comment:18 Changed 5 years ago by Quozl

  • Resolution set to fixed
  • Status changed from new to closed

I've tested, it didn't happen. I'll close this now. Mikus, if you see it again with build os43 or later, please reopen or raise a new ticket, thanks.

comment:19 Changed 5 years ago by dsd

  • Resolution fixed deleted
  • Status changed from closed to reopened

reopening until we have this fixed properly in a build (which means we'll have to get it fixed upstream first)

I also just saw this but I am pretty sure I didn't do anything harsh to the system which would have caused corruption. attaching shell log now

Changed 5 years ago by dsd

shell log

Changed 5 years ago by dsd

corrupted datastore

comment:20 Changed 5 years ago by dsd

e2fsck did not find any problems

comment:21 Changed 5 years ago by dsd

  • Action Needed changed from test in build to review

comment:22 follow-up: Changed 5 years ago by mikus

[I have no idea if the following is pertinent. Also, I can neither explain nor reproduce it.]

os47. I had a snapshot entry in Journal, left over from several reboots before. Hovered over that entry's icon in Journal, clicked on 'Erase' in palette - nothing happened. Repeated this - still nothing happened (entry was still in Journal). Clicked on arrowhead at right of entry (the one that was supposed to show me the details of the entry) - nothing happened. [Other entries in Journal *would* show me their details, when I clicked on their corresponding arrowheads.]

Figured I would capture a "clean look" of the Journal database - so I rebooted. But now the snapshot entry was *gone* from Journal !!

comment:23 Changed 5 years ago by sayamindu

comment:24 in reply to: ↑ 22 Changed 5 years ago by sayamindu

Replying to mikus:

[I have no idea if the following is pertinent. Also, I can neither explain nor reproduce it.]

os47. I had a snapshot entry in Journal, left over from several reboots before. Hovered over that entry's icon in Journal, clicked on 'Erase' in palette - nothing happened. Repeated this - still nothing happened (entry was still in Journal). Clicked on arrowhead at right of entry (the one that was supposed to show me the details of the entry) - nothing happened. [Other entries in Journal *would* show me their details, when I clicked on their corresponding arrowheads.]

Figured I would capture a "clean look" of the Journal database - so I rebooted. But now the snapshot entry was *gone* from Journal !!

Looks like a case of #9774 to me.

comment:25 Changed 5 years ago by Quozl

  • Resolution set to fixed
  • Status changed from reopened to closed

Duplicate #9774, fixed in os50.

comment:26 Changed 5 years ago by anonymous

  • Milestone 1.5-software deleted

Milestone 1.5-software deleted

Note: See TracTickets for help on using tickets.