Opened 3 years ago

Closed 3 years ago

#11363 closed defect (fixed)

XO-1.75 mouse sometimes unusable on boot

Reported by: dsd Owned by: saadia
Priority: normal Milestone: 1.75-software
Component: kernel Version: not specified
Keywords: Cc: greenfeld
Blocked By: Blocking:
Deployments affected: Action Needed: no action
Verified: no

Description

Sam showed me an XO-1.75 which booted up with the mouse cursor unresponsive to pad actions. dmesg showed:

psmouse.c: Failed to deactivate mouse on olpc_touchpad/serio0
psmouse.c: Failed to enable mouse on olpc_touchpad/serio0
psmouse.c: Failed to deactivate mouse on olpc_touchpad/serio0
psmouse.c: Failed to enable mouse on olpc_touchpad/serio0
psmouse.c: Failed to deactivate mouse on olpc_touchpad/serio0
psmouse.c: Failed to enable mouse on olpc_touchpad/serio0

rmmod/modprobe psmouse solved the issue, but the issue reappeared immediately upon reboot.

olpc-keyboard driver should be modified to add a 'debug' parameter like i8042 so that we can get a better handle on this.

Attachments (6)

bug11363-cap1.txt (60.1 KB) - added by saadia 3 years ago.
bug11363-cap2.txt (60.5 KB) - added by saadia 3 years ago.
dmesg (16.7 KB) - added by godiard 3 years ago.
messages (34.2 KB) - added by godiard 3 years ago.
log-wo-rcd.txt (62.6 KB) - added by avatar 3 years ago.
os18 boot log
resume-reload.txt (551.6 KB) - added by avatar 3 years ago.
os20 reloading psmouse.ko after pm-suspend

Download all attachments as: .zip

Change History (21)

comment:1 Changed 3 years ago by dsd

arm-3.0-wip b67f6bf adds a debugging parameter.

When this can be reproduced, olpc.fth should be modified to add the olpc_keyboard.debug=1 kernel parameter, then reboot, and assuming it reproduced again, we should have some useful diagnosis in the kernel logs.

The parameter can also be accessed with:

echo 1 > /sys/module/olpc_keyboard/debug

but in this particular case, running the above after boot will be too late.

comment:2 Changed 3 years ago by saadia

I have captured a log with this debug information (attached as bug11363-cap1.txt). Once again (as seen in Bug 11357) the crash is a result of ps2_command during psmouse_connect.

Changed 3 years ago by saadia

comment:3 Changed 3 years ago by saadia

Previously added attachment is not relevant to this bug, but to bug 11357

comment:4 Changed 3 years ago by saadia

I reproduced this bug with the debug flag on. Log is attached in bug11363-cap2.txt. The first failure from psmouse.c is at 16.062

Changed 3 years ago by saadia

comment:5 Changed 3 years ago by saadia

  • Owner set to saadia
  • Status changed from new to assigned

comment:6 Changed 3 years ago by godiard

I have found it with http://dev.laptop.org/~dsd/.os883/xo-1.75try2/

I tried doing rmmod psmouse / modprobe psmouse and later adding echo 1 > "/sys/module/olpc_keyboard/parameters/debug" and did not resolved the issue.

/var/log/dmesg and /var/log/messages attached

Changed 3 years ago by godiard

Changed 3 years ago by godiard

comment:7 Changed 3 years ago by greenfeld

Still an issue with 3.0.0_xo1.75-20111202.0908.olpc.4750b50; encountered on a membrane C1 unit.

comment:8 Changed 3 years ago by pgf

i have two C1 machines which exhibit this issue repeatedly. it can be reliably reproduced with repeated "modprobe -r psmouse; modprobe psmouse" operations, which cause a high volume of bidirectional mouse traffic. when the problem hits, the mouse effectively locks up, and will apparently not respond to any more commands. both machines have AVC touchpads.

working with one of these machines, it seems that running q4c04 (and ec 0.3.04) has a _much_ lower occurrence rate for the problem than running q4c05/0.3.05. nothing jumps out at me from the changes in either the OFW/CForth changelogs, or in the EC changelog. unfortunately, the earlier firmware isn't completely free of the bad behavior -- it's just much harder to provoke (though i'd love corroboration of this.)

another interesting symptom, presumably related to bad communication with the mouse, is that the touchpad is sometimes misidentified. correct identification looks like:

FSPPS/2 Sentelic FingerSensingPad

but it may become

ImExPS/2 Sentelic FingerSensingPad

or

PS/2 Sentelic FingerSensingPad

or

PS/2 Generic Mouse

also, my two pads identify differently:

 Finger Sensing Pad, hw: 13.2.100, sw: 1.0.0-K, buttons: 4

and

 Finger Sensing Pad, hw: 13.2.1, sw: 1.0.0-K, buttons: 4

comment:9 Changed 3 years ago by jnettlet

I just got a string of reproducible conditions and this is what I found. Each time I soft-rebooted I would check test /mouse in ofw and the touchpad worked fine there. Then booting to linux would lead to this output in dmesg

input: PS/2 Generic Mouse as /devices/platform/olpc-kbd.0/serio1/input/input7
psmouse.c: Failed to enable mouse on olpc_touchpad/serio0

after playing around a bit with the protocols the mouse started working again, but because of anything else I did. The big difference I could find was that the device path for the working mouse was.

input: PS/2 Generic Mouse as /devices/platform/olpc-kbd.0/serio1/input/input4

wonder what is going on make the device enumerate 3 additional device numbers?

Changed 3 years ago by avatar

os18 boot log

comment:10 Changed 3 years ago by avatar

log-wo-rcd.txt is another failure case I was running into. It turns out that the SoC didn't respond to PS/2 commands. For example, the 0xe9 command at 8.107605 never got response when compared with 8.030017, which received reply within 10ms.

The same failure repeated again since 208.106654(when fsp_onpad_vscr() was invoked). Even worse, the subsequent 0xf3 command initiated by psmouse_set_rate(808.163444) caused another timeout in olpc_keyboard.c.

OTOH, there's a stray 0xfa at 812.308753, which probably is the response to the timed out 0xf4 sent at 811.256076; however, psmouse.c already considered there's an 'enable mouse failure' at 812.246668.

According to various logs I've read, it looks to me that SoC somehow delayed or never replied to command sometimes(heavy load perhaps?). I suspect this could be related to the lost of control described in ticket/11357 as FSP wasn't initialised correctly or being took down by the 0xf5 command since certain command packet such like 0xf4 never reached to FSP.

comment:11 follow-up: Changed 3 years ago by avatar

By booting a kernel without mmp-camera module(CONFIG_VIDEO_MMP_CAMERA), I no longer get "psmouse.c: Failed to enable mouse on olpc_touchpad/serio" message anymore.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 3 years ago by pgf

Replying to avatar:

By booting a kernel without mmp-camera module(CONFIG_VIDEO_MMP_CAMERA), I no longer get "psmouse.c: Failed to enable mouse on olpc_touchpad/serio" message anymore.

thank you for this report. i have a pending firmware change which will definitely help with this issue, but i'm at a loss as to how the camera might affect it. i'll investigate that.

fyi, a much faster way to reproduce the problem, without rebooting, is:

modprobe -r psmouse; modprobe psmouse

this is known to fail completely after the laptop has suspended and resumed at least once, so it's a more interesting test before the first suspend.

comment:13 in reply to: ↑ 12 Changed 3 years ago by avatar

Replying to pgf:

Replying to avatar:

By booting a kernel without mmp-camera module(CONFIG_VIDEO_MMP_CAMERA), I no longer get "psmouse.c: Failed to enable mouse on olpc_touchpad/serio" message anymore.

thank you for this report. i have a pending firmware change which will definitely help with this issue, but i'm at a loss as to how the camera might affect it. i'll investigate that.

A wild guess -- could camera initialisation took up too much CPU resource thus "blocking" certain PS/2 commands being issued to FSP?

fyi, a much faster way to reproduce the problem, without rebooting, is:

modprobe -r psmouse; modprobe psmouse

this is known to fail completely after the laptop has suspended and resumed at least once, so it's a more interesting test before the first suspend.

Thanks for the hint. FWIW, the log indicates that there're PS/2 command writing timeout during module reload:

[ 3892.744180] eth0: no IPv6 routers present
[ 3909.969624] olpc_keyboard: olpc_kbd_pad_write which=100 val=f5
[ 3910.164188] olpc_keyboard: olpc_kbd_pad_write which=100 val=f3
[ 3910.364186] olpc_keyboard: olpc_kbd_pad_write which=100 val=f4
[ 3910.564195] psmouse serio1: Unable get OPC state.
[ 3910.564233] olpc_keyboard: olpc_kbd_pad_write which=100 val=f5
[ 3910.764299] olpc_keyboard: olpc_kbd_pad_write which=100 val=f3
[ 3910.964748] olpc_keyboard: olpc_kbd_pad_write which=100 val=f4
[ 3911.964208] olpc_keyboard: olpc_kbd_pad_write timeout, status=103
[ 3911.964208] olpc_keyboard: olpc_kbd_pad_write which=100 val=f5
[ 3912.964179] olpc_keyboard: olpc_kbd_pad_write timeout, status=103
[ 3912.964208] olpc_keyboard: olpc_kbd_pad_write which=100 val=f3
[ 3913.964190] olpc_keyboard: olpc_kbd_pad_write timeout, status=103
[ 3913.964190] olpc_keyboard: olpc_kbd_pad_write which=100 val=f4
[ 3914.964166] olpc_keyboard: olpc_kbd_pad_write timeout, status=103
[ 3915.034718] olpc_keyboard: olpc_kbd_pad_open
[ 3915.034718] olpc_keyboard: olpc_kbd_pad_write which=100 val=f2
[ 3916.034169] olpc_keyboard: olpc_kbd_pad_write timeout, status=103

Changed 3 years ago by avatar

os20 reloading psmouse.ko after pm-suspend

comment:14 Changed 3 years ago by pgf

  • Action Needed changed from diagnose to package

i believe this will be fixed in firmware Q4C08.

comment:15 Changed 3 years ago by martin.langhoff

  • Action Needed changed from package to no action
  • Resolution set to fixed
  • Status changed from assigned to closed

This has been fixed, and in builds, for a while. We haven't seen psmouse protocol confusion in a while, so closing as fixed.

Please reopen if the issue reappears.

Note: See TracTickets for help on using tickets.