Opened 6 years ago

Last modified 6 years ago

#7341 new defect

touchpad is a little off the rails in olpc3

Reported by: dsd
Owned by: dilinger
Priority: high
Milestone: 8.2.0 (was Update.2)
Component: kernel
Version: olpc-3
Keywords: olpc3-23:-
Cc: dsaxena, smithbone
Blocked By:
Blocking: #7393
Deployments affected:
Action Needed: never set
Verified: no

Description

The touchpad doesn't work very well in the olpc3 builds. Decreasing sensitivity helps (#7211) but does not make the problem completely go away.

To reproduce, just move the cursor around the screen for a minute or so. Sooner or later the cursor will jump around erratically for a while, making it very hard to click on anything. Also sometimes when you lift your finger from the pad, the cursor jumps to some other position (making clicking hard - you position the mouse at the right location, but then it moves somewhere else when you are ready to click)

I understand there is a completely rewritten touchpad driver in this stream, so that would be the obvious suspect?

Change History (10)

comment:1 Changed 6 years ago by dsaxena

  • Action Needed set to never set
  • Cc dsaxena added

comment:2 Changed 6 years ago by dsd

  • Blocking 7383 added

comment:3 follow-up: Changed 6 years ago by dsaxena

  • Cc smithbone added

Spent some time watching packet logs, and I think we might be recalibrating the TP too often with the heuristics in the driver (possibly due to HW issues?). Two specific cases:

  1. If I move my finger around quickly, I often trigger the "delta too large" recalibration event, whose threshold defaults to 60. If I increase this to 120 (/sys/module/psmouse/parameters/ignore_delta), I drastically decrease the number of times a recalibrate is issued (almost 5 minutes of constant rapid movement) and only seem to trigger it after very drastic movements. This might be related to only receiving packets every 24ms, which smithbone is investigating: perfectly acceptable movements may appear erroneous to us due to missing data.
  2. The other scenario is around the following function in the driver:
/*
 * This is my favorite touchpad hardware bug.  I'm entirely not sure what
 * triggers it (I've seen it triggered while the laptop was left on overnight,
 * but my cat could have very well been using it/sleeping on it).  However,
 * the touchpad will randomly get stuck in a state where it constantly spews
 * packets without a finger being on it.  A recalibration will fix it, but
 * without that it will go on for days (auto-recalibration doesn't catch it,
 * either).  The packets tend to either have the same coordinates, or be
 * 1px away from each other; ie, (283,139,6) -> (284,139,5) -> (285,139,5) ->
 * (286,139,6) -> (286,139,6) -> etc.  We have a number of workarounds here..
 */
static void hgpk_spewing_hack(struct psmouse *psmouse, struct hgpk_packet *p)

I believe this recalibration is being falsely triggered in certain cases where my finger moves very slowly or rests in one place, but I release the pressure so the Z value goes down. For example, in the following trace:

l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=87 y=206 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
# ... [300+ samples of above] ...
# lightly remove pressure
p=0 g=1 x=89 y=202 z=13 m=0
Recalibrating touchpad..

I removed finger pressure and tried to move, but because the pad was in the middle of a recal, the behavior again went a bit out of whack.

The basic issue with both cases above is that trying to move the pointer in the middle of recal triggers another recal (etc, etc) and all our data during this time frame is essentially bogus AFAICT.

If we can't differentiate between truly bogus data and bogus-looking but valid data, fixing this is going to be a painful exercise. :(
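
To make the false positive in the second case concrete, here is a minimal userspace sketch of a spew heuristic of the kind the driver comment describes. It is not the actual hgpk_spewing_hack() code: the names (spew_state, spew_check), the 100-packet window, and the 3px limit are all assumptions chosen for illustration. It tallies net movement over a window of packets and flags a spew when many packets arrive with almost no net motion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical tuning values, not taken from the driver. */
#define SPEW_WINDOW 100  /* packets examined per window */
#define SPEW_LIMIT  3    /* net movement (px) below this counts as "stuck" */

struct spew_state {
	int last_x, last_y;   /* previous absolute position */
	int x_tally, y_tally; /* net movement inside the current window */
	int count;            /* packets seen in the current window */
	bool primed;          /* false until the first packet arrives */
};

/*
 * Feed one absolute (x, y) sample.  Returns true when a full window of
 * packets has gone by with almost no net movement, i.e. the point at
 * which a driver using this heuristic would schedule a recalibration.
 */
static bool spew_check(struct spew_state *s, int x, int y)
{
	bool spewing = false;

	assert(s != NULL);

	if (!s->primed) {
		/* First packet: nothing to compare against yet. */
		s->last_x = x;
		s->last_y = y;
		s->primed = true;
		return false;
	}

	s->x_tally += x - s->last_x;
	s->y_tally += y - s->last_y;
	s->last_x = x;
	s->last_y = y;

	if (++s->count >= SPEW_WINDOW) {
		spewing = abs(s->x_tally) < SPEW_LIMIT &&
			  abs(s->y_tally) < SPEW_LIMIT;
		/* Start a fresh window either way. */
		s->x_tally = 0;
		s->y_tally = 0;
		s->count = 0;
	}
	return spewing;
}
```

Feeding this sketch 300+ samples of x=89 y=202, as in the trace above, trips the check just as a genuinely stuck pad would. Position data alone gives it no way to tell a resting finger from the hardware bug.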

comment:4 Changed 6 years ago by dsaxena

  • Milestone changed from Never Assigned to 8.2.0 (was Update.2)
  • Priority changed from normal to high

comment:5 in reply to: ↑ 3 ; follow-up: Changed 6 years ago by dilinger

Replying to dsaxena:

Spent some time watching packet logs, and I think we might be recalibrating the TP too often with the heuristics in the driver (possibly due to HW issues?). Two specific cases:

Keep in mind that these are hacks to work around flaky hardware; there's no way we're going to get this perfect. The best we can hope for is 'usable'.

  1. If I move my finger around quickly, I often trigger the "delta too large" recalibration event, whose threshold defaults to 60. If I increase this to 120 (/sys/module/psmouse/parameters/ignore_delta), I drastically decrease the number of times a recalibrate is issued (almost 5 minutes of constant rapid movement) and only seem to trigger it after very drastic movements. This might be related to only receiving packets every 24ms, which smithbone is investigating: perfectly acceptable movements may appear erroneous to us due to missing data.

The reason I made the threshold configurable is that I wasn't sure that 60px was a good value. Now, we're trying to balance buggy hardware behavior w/ what the user may actually do. Yes, you can trigger the threshold thing with very drastic movements, but is that really how the user is going to be using the touchpad? I expected more controlled, fluid movements. Of course, the occasional huge jump is expected, which is why it requires two huge jumps in a row to trigger a recalibration. However, that may not be sensitive enough, which is why I committed a change to the testing branch (iirc) to trigger a recalibration after just one huge jump. I was working w/ Richard at the time, and he was seeing jumpiness that just wasn't triggering with two jumps in a row.

And yes, the 24ms thing would certainly affect this workaround. A fixed EC is important; however, we've also got many machines out in the field w/ older firmware versions that will probably never get upgraded, so the workarounds will need to deal properly w/ the 24ms thing (unless we can force users to upgrade their firmware).
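
The trade-off between one and two consecutive jumps can be sketched as below. This is an illustrative userspace model, not the driver's code: struct jump_cfg, struct jump_state, jump_check(), and the field names are hypothetical. Only the 60 and 120 thresholds and the one-vs-two-jumps policy come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct jump_cfg {
	int max_delta;    /* px; a per-packet delta above this is a "huge jump" */
	int jumps_needed; /* consecutive huge jumps required before a recal */
};

struct jump_state {
	int consecutive;  /* huge jumps seen in a row so far */
};

/*
 * Feed one relative (dx, dy) movement.  Returns true when the policy in
 * cfg says a recalibration should be triggered.
 */
static bool jump_check(const struct jump_cfg *cfg, struct jump_state *st,
		       int dx, int dy)
{
	assert(cfg->max_delta > 0 && cfg->jumps_needed >= 1);

	if (abs(dx) > cfg->max_delta || abs(dy) > cfg->max_delta)
		st->consecutive++;
	else
		st->consecutive = 0; /* a normal packet breaks the streak */

	if (st->consecutive >= cfg->jumps_needed) {
		st->consecutive = 0;
		return true;
	}
	return false;
}
```

With max_delta = 60 and jumps_needed = 2, a single 80px delta is ignored; with jumps_needed = 1 the same delta triggers a recal, and raising max_delta to 120 suppresses it again. Because the delta is measured per packet, a longer gap between packets makes the same finger speed produce a larger delta, which is how the 24ms issue interacts with this check.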

  2. The other scenario is around the following function in the driver:
/*
 * This is my favorite touchpad hardware bug.  I'm entirely not sure what
 * triggers it (I've seen it triggered while the laptop was left on overnight,
 * but my cat could have very well been using it/sleeping on it).  However,
 * the touchpad will randomly get stuck in a state where it constantly spews
 * packets without a finger being on it.  A recalibration will fix it, but
 * without that it will go on for days (auto-recalibration doesn't catch it,
 * either).  The packets tend to either have the same coordinates, or be
 * 1px away from each other; ie, (283,139,6) -> (284,139,5) -> (285,139,5) ->
 * (286,139,6) -> (286,139,6) -> etc.  We have a number of workarounds here..
 */
static void hgpk_spewing_hack(struct psmouse *psmouse, struct hgpk_packet *p)

I believe this recalibration is being falsely triggered in certain cases where my finger moves very slowly or rests in one place, but I release the pressure so the Z value goes down. For example, in the following trace:

l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=87 y=206 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
# ... [300+ samples of above] ...
# lightly remove pressure
p=0 g=1 x=89 y=202 z=13 m=0
Recalibrating touchpad..

I removed finger pressure and tried to move, but because the pad was in the middle of a recal, the behavior again went a bit out of whack.

Known issue, but again, I had to weigh expected use cases against buggy hardware. Why would a user keep their finger resting on the touchpad for a long time? If we could distinguish between that and the spewing bug, that'd be great; but I've been unable to.

The basic issue with both cases above is that trying to move the pointer in the middle of recal triggers another recal (etc, etc) and all our data during this time frame is essentially bogus AFAICT.

Unfortunately, the recalibration procedure itself is not ideal..

If we can't differentiate between truly bogus data and bogus-looking but valid data, fixing this is going to be a painful exercise. :(

Well, yes. First we need to trigger hardware bugs that are, by definition, incredibly hard to trigger. Then, we need to inspect the packet data and figure out ways to identify when the touchpad hw has screwed up. You can approximate the touchpad spew bug by doing a 4-finger-salute while keeping a piece of tinfoil or rubber on the touchpad, and then removing it; 1 out of every 5 or 10 times, it will begin spewing packets.

comment:6 in reply to: ↑ 5 Changed 6 years ago by dsaxena

  • Blocking 7393 added

Replying to dilinger:

Replying to dsaxena:

  1. If I move my finger around quickly, I often trigger the "delta too large" recalibration event, whose threshold defaults to 60. If I increase this to 120 (/sys/module/psmouse/parameters/ignore_delta), I drastically decrease the number of times a recalibrate is issued (almost 5 minutes of constant rapid movement) and only seem to trigger it after very drastic movements. This might be related to only receiving packets every 24ms, which smithbone is investigating: perfectly acceptable movements may appear erroneous to us due to missing data.

The reason I made the threshold configurable is that I wasn't sure that 60px was a good value. Now, we're trying to balance buggy hardware behavior w/ what the user may actually do. Yes, you can trigger the threshold thing with very drastic movements, but is that really how the user is going to be using the touchpad? I expected more controlled, fluid movements. Of course, the occasional huge jump is expected, which is why it requires two huge jumps in a row to trigger a recalibration. However, that may not be sensitive enough, which is why I committed a change to the testing branch (iirc) to trigger a recalibration after just one huge jump. I was working w/ Richard at the time, and he was seeing jumpiness that just wasn't triggering with two jumps in a row.

testing does indeed have your change to recal after only 1 large delta. I'll play around with this to see if I can impact behavior. Maybe make this into a tunable too?

  2. The other scenario is around the following function in the driver:

I believe this recalibration is being falsely triggered in certain cases where my finger moves very slowly or rests in one place, but I release the pressure so the Z value goes down. For example, in the following trace:

l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=87 y=206 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
l=0 r=0 p=0 g=1 x=89 y=202 z=15 m=0
# ... [300+ samples of above] ...
# lightly remove pressure
p=0 g=1 x=89 y=202 z=13 m=0
Recalibrating touchpad..

I removed finger pressure and tried to move, but because the pad was in the middle of a recal, the behavior again went a bit out of whack.

Known issue, but again, I had to weigh expected use cases against buggy hardware. Why would a user keep their finger resting on the touchpad for a long time? If we could distinguish between that and the spewing bug, that'd be great; but I've been unable to.

Note that I did reproduce this at times by just resting for a second at the end of a stroke in one direction.

If we can't differentiate between truly bogus data and bogus-looking but valid data, fixing this is going to be a painful exercise. :(


Well, yes. First we need to trigger hardware bugs that are, by definition, incredibly hard to trigger. Then, we need to inspect the packet data and figure out ways to identify when the touchpad hw has screwed up. You can approximate the touchpad spew bug by doing a 4-finger-salute while keeping a piece of tinfoil or rubber on the touchpad, and then removing it; 1 out of every 5 or 10 times, it will begin spewing packets.

So we need to figure out what to do next. With 4 weeks until August, we need to either fix the issues or see whether moving the stable driver into the testing kernel would be a simpler solution for now.

On my end, I will:

  1. Post something on devel to get more input from people on what behavior they are seeing with the TP.
  2. Build a 2.6.25 kernel with the old driver to play around with and see what the behavior is like there.

comment:7 Changed 6 years ago by dsd

  • Blocking 7383 removed

Dropping block on #7383 because this seems linked to the kernel and not to F9 specifically.

comment:8 follow-up: Changed 6 years ago by tomeu

In joyride-2110, the mouse now seems to move too slowly.

comment:9 in reply to: ↑ 8 Changed 6 years ago by dsaxena

Replying to tomeu:

In joyride-2110, the mouse now seems to move too slowly.

As per #7211, you need to edit /usr/bin/olpc-session and change the "xset m" setting back to 7/6.

comment:10 Changed 6 years ago by bobby

xset m 7/4 1
works GREAT for me in 2130
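
For reference, "xset m 7/4 1" sets a pointer acceleration of 7/4 and a threshold of 1 pixel. A rough sketch of the classic model described in the xset man page (the pointer goes "acceleration" times as fast once it travels more than "threshold" pixels in a short time) is below. The function name accel_delta() is hypothetical, it handles a single axis only, and real X servers may apply a more elaborate acceleration profile, so treat it as an approximation.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Scale one axis of a relative movement by num/den when the movement
 * exceeds threshold pixels; smaller movements pass through unchanged.
 * Integer division truncates toward zero.
 */
static int accel_delta(int delta, int num, int den, int threshold)
{
	assert(den != 0);

	if (abs(delta) <= threshold)
		return delta;
	return delta * num / den;
}
```

Under this model an 8px movement becomes 14px with 7/4, while a 1px movement is left alone, so fine positioning stays precise and larger strokes cover more of the screen.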
