Ticket #5705 (closed defect: fixed)

Opened 7 years ago

Last modified 5 years ago

Everything breaks when we fail D-Bus' <at_console> authorization check.

Reported by: cjb Owned by: mstone
Priority: blocker Milestone: 8.2.0 (was Update.2)
Component: distro Version:
Keywords: rainbow-integration, security Cc: mstone, cscott, bernie, mtd
Action Needed: package Verified: no
Deployments affected: Blocked By:
Blocking:

Description

After using sudo on tty1 for a while on Joyride, and switching back and forth between tty1 and X, I started to see an error messsage along the lines of "can't access tty" when running e.g. "sudo ls" on tty1, and sudo refused to grant me root.

I also noticed that "ssh user@foo" fails with "Permission denied" straight away -- adding -v showed that ssh is unable to open /dev/tty (to ask for my password, presumably).

Until this is fixed, I think it might be unwise to continue using sudo as a replacement for su.

Change History

  Changed 7 years ago by jg

  • cc mstone added
  • keywords rainbow-integration, security added
  • priority changed from normal to high
  • milestone changed from Never Assigned to Update.1

  Changed 7 years ago by cjb

  • cc cscott, bernie added

Raising to blocker, as asked by jg.

  Changed 7 years ago by cjb

  • priority changed from high to blocker

  Changed 7 years ago by bernie

Please, provide more information on how to reproduce it.

  Changed 7 years ago by cjb

Please, provide more information on how to reproduce it.

Hm. You came up with what seemed like a good conclusion to me, which was that there's no locking of /dev/tty permissions, and X can change them out from under the tty. We didn't test the theory, though.

I don't have good instructions for reproducing; it's happened to me several times on several different reboots, usually when heavily using the tty but also switching back and forth to X, possibly restarting X, etc.

  Changed 7 years ago by jg

  • owner changed from jg to bernie

  Changed 7 years ago by cscott

#5537 may be switching back to su under the covers; this may (or may not) make the problem go away.

  Changed 6 years ago by cscott

  • owner changed from bernie to cscott
  • next_action set to never set

cjb, can I assume that this problem has been fixed and close this bug?

follow-up: ↓ 10   Changed 6 years ago by cjb

I think we still this sometimes, and are going to continue to see it while olpc-dm is owning tty1.

in reply to: ↑ 9   Changed 6 years ago by mtd

  • cc mtd added

Replying to cjb:

I think we still this sometimes, and are going to continue to see it while olpc-dm is owning tty1.

This commit fixes olpc-dm owning tty1 for me. The commit msg has w & ps output to confirm

  Changed 6 years ago by cjb

Hey, Scott and Michael, check out that patch. I'd like to nominate it for 8.2, since it fixes a blocker. Could we get it into Joyride after this week's weekly build, perhaps?

follow-up: ↓ 13   Changed 6 years ago by cscott

mtd: can I get some more information about how your patch is expected to work? Why were we originally setting tty to tty2? If this was previously completely broken, why didn't we notice? Why is removing the assignment of controlling tty the right thing to do? If I understand correctly, this code was originally lifted from gdm -- why is this the right thing for gdm to do, but not for us?

Removing code like this makes me very nervous absent a good explanation.

in reply to: ↑ 12   Changed 6 years ago by mtd

Replying to cscott:

mtd: can I get some more information about how your patch is expected to work?

This ticket observes that olpc-dm owns tty1 and that isn't desired. The way to relinquish ownership of a tty whilst forking is clearly laid out as steps 2 and 3 in http://www.steve.org.uk/Reference/Unix/faq_2.html#SEC16 , and not-so-clearly laid out as a consequence of doing step 6 (which is what wasn't being done by olpc-dm.c, on purpose (see my answer to your "Why is [this/step 6] the right thing to do", belo), and what my patch enables).

Why were we originally setting tty to tty2?

I don't know. AFAICS from the git history we've never changed that line of code. But I know we're supposed to be running X on tty3, right? And my F9 gdm-binary / gdm-simple-slave do something very similar to that. You'll forgive me if I skimp on answering this historical question to draw your attention to my answer to your final two questions.

If this was previously completely broken, why didn't we notice?

Well this ticket's 7 months old. And one wouldn't notice it being "broken" unless you try to use one of tty{1,2}. ibid. re historical questions before my time.

Why is removing the assignment of controlling tty the right thing to do?

Because we don't know what tty we're started on, and we should be started on tty3. Starting on tty3 is right by definition (I suppose, though I don't have a Sugar/XO design document in mind). For a stronger argument: because we don't know what tty we're started on, and we *should* be started on a well known tty (no design document references again, sorry, but this seems really common sense to me).

If I understand correctly, this code was originally lifted from gdm -- why is this the right thing for gdm to do, but not for us?

I'm not totally clear on my olpc-dm was meant to do, but just that the problem - olpc-dm should not grab the same tty as another mingetty does - is solved. Of course more problems could be introduced in some subtle ways, so I'm glad people more aware of olpc-dm's goals are thinking about this.

Removing code like this makes me very nervous absent a good explanation.

I understand, but even if you want to believe that tty2 was chosen on purpose and that olpc-dm should be running on the same tty as one of the XO's mingetty's, consider the two - three, by some count - large scary comments in olcp-dm.c near my patch's changed lines, saying, roughly "well this area of code might be causing [exactly the brokenness mtd's patch fixes]". Just as some small comfort.

I found re-visiting how to properly double fork (which, FTR, I believe login.c's code *was* achieving for its use case, but ours is different because we've not got ourselves a good tty from mingetty or similar) at http://www.steve.org.uk/Reference/Unix/faq_2.html#SEC16 especially fun, so I welcome any illuminations about what this code should (or even was intended to ;)) be doing and don't pretend I know all the context in which my patch would be executed.

That being said, I'm pretty unconvinced of any arguments that tty2 was chosen explicitly and that the behavior of grafting on to a mingetty-managed terminal is desirable.

follow-up: ↓ 15   Changed 6 years ago by mstone

Regarding the question of shippability: I'm certainly interested in the patch though I'd like to see some more testing results first.

Regarding the question of correctness: Why should we be hardcoding the tty number at all? Shouldn't the choice of tty be passed as a commandline argument/env var or detected at run time?

(In my mind, a correct implementation would respond gracefully to the addition of new mingetty's ttys both before and after the X tty.)

in reply to: ↑ 14 ; follow-up: ↓ 22   Changed 6 years ago by mtd

Replying to mstone:

Regarding the question of shippability: I'm certainly interested in the patch though I'd like to see some more testing results first.

If I can assist let me know. As you can see from the patch I've included test cases, and I've been running it (without problems) since I came up with the patch. Of course that's a very small amount of time. I'm not an expert in testing these things to know what're the most fruitful areas to test, and I don't have enough time to become such an expert.

Regarding the question of correctness: Why should we be hardcoding the tty number at all?

Slight nitpick - that might be more flexibility rather than correctness. But regardless, it just depends on what you're aiming for and how long you want to carry olpc-dm. If not for too long, then who cares? As as one of cscott's queries implied, if it was wrong for this long and nobody cared too much, this small (but clear) enhancement in maintainability might not justify the additional complexity (changing invocation scripts/documentation, etc.).

(In my mind, a correct implementation would respond gracefully to the addition of new mingetty's ttys both before and after the X tty.)

I agree that would be nicer. I'm not clear on how to discover free ttys, though, especially as I'm not clear what assumptions our approach to configuring ttys via upstart allows us to make about the relative timings of tty allocations vs. prefdm/olpc-dm invocation.

As a meta-comment I'd return to my "how important is improving olpc-dm?" question: it sounds like you guys have thought about this a lot (or are willing to), in which case random contributors without much *dm experience are just going to drag you down at the design phase, IMO.

  Changed 6 years ago by cscott

Given mtd's explanation (thanks a lot, by the way, it was very helpful), my guess is that olpc-dm was originally written to be started from the initscripts (in tty0?), in which case it probably needed to grab a specific tty. Around the time we moved to upstart, olpc-dm started being invoked from /etc/inittab instead, which gave it a specific controlling tty already and so it no longer needed to grab one.

I'm going to add some of mtd's excellent explanation above to olpc-dm.c as a comment, and then commit the patch. If we ever need to start a display on more than one vt, then we'll know what to do. =)

  Changed 6 years ago by cscott

  • owner changed from cscott to mstone
  • next_action changed from never set to package

Michael: http://dev.laptop.org/git/users/cscott/olpc-utils-tmp

That's mtd's patch plus some additional explanations and cleanup.

  Changed 6 years ago by cscott

This went into joyride-2245, and the network mesh view promptly broke. joyride-2243 works, joyride-2245 (with just olpc-utils changed) doesn't.

Anyone want to venture a guess why changing the controlling tty of X breaks the network? One hypothesis: dbus access control is breaking, and it's ConsoleKit's fault?

  Changed 6 years ago by mtd

It seems sugar's hardwaremanager.py can't get nmclient.py to get a dbus connection/proxy object to NetworkManager. As you / m_stone suggested, it seems consolekit / pam don't do what dbus's "at_console" policy rule expects (my terminology here is weak, sorry). See https://bugs.freedesktop.org/show_bug.cgi?id=14053 for an example of the fix.

I did this as root and it fixed the problem:

cd /etc/ConsoleKit/run-session.d
wget --no-check-certificate -O pam-foreground-compat.ck https://bugs.freedesktop.org/attachment.cgi?id=17717
init 6

  Changed 6 years ago by cscott

Michael: http://dev.laptop.org/git?p=users/cscott/olpc-utils-tmp;a=commitdiff;h=7bd5be68b6ee9522f76aae9e56d43c48f4dc2efa

Tested and it seems to fix the problems with the network view. It also seems to make usb mounting while logged into a vt work again.

  Changed 6 years ago by mstone

  • blocking 7797 added

(In #7797) Seems that #5705 was at fault. We will need to retest when it is fixed.

in reply to: ↑ 15   Changed 6 years ago by mikus

Replying to mtd:

Replying to mstone:

Regarding the question of correctness: Why should we be hardcoding the tty number at all?

Slight nitpick - that might be more flexibility rather than correctness. But regardless, it just depends on what you're aiming for and how long you want to carry olpc-dm. If not for too long, then who cares? As as one of cscott's queries implied, if it was wrong for this long and nobody cared too much, this small (but clear) enhancement in maintainability might not justify the additional complexity (changing invocation scripts/documentation, etc.).

A minor comment, for the sake of recording what I've seen -- I'm not asking for any code changes.

I'm one of the people who have issued 'startx -- :1', and have (not very successfully) ended up in Sugar on ctl-alt-F4. I think if the tty is hard-coded, that impacts people who want to simultaneously run more than one "user" on the OLPC.

  Changed 6 years ago by mstone

  • summary changed from sudo gives "can't access tty" errors. to Everything breaks when we fail D-Bus' <at_console> authorization check.

olpc-utils-0.83-1.olpc3 built in koji with Scott's/Martin's fix. Please retest.

  Changed 6 years ago by mikus

  • blocking 7797 removed

(In #7797) This ticket is getting long in the tooth. I doubt if further changes to Rainbow in 8.2.x are planned (and many of the cited Activites now run fine there). If some Activities fail with 9.1, individual tickets can be written for them. Time to close this "excessive scope" ticket.


Note: Since I originally wrote this ticket, #5705 has been added as a blocker -- that relationship may or may not be warranted. But at the time when I wrote #7797, what I had in mind were the Activities which stored in places like /home/olpc (when there was no Rainbow to prohibit that). #5705 mainly concerns which tty an activity runs under -- I believe that #7797 does not have to be kept open while pursuing that problem.

  Changed 5 years ago by dsd

  • status changed from new to closed
  • resolution set to fixed

this was successfully fixed above

Note: See TracTickets for help on using tickets.