Ticket #8976 (closed defect: fixed)

Opened 6 years ago

Last modified 5 years ago

Wireless activation doesn't work in 8.2.

Reported by: cjb
Owned by: cscott
Priority: blocker
Milestone: 8.2.1
Component: initramfs
Version: not specified
Keywords: cjbfor9.1.0 8.2.1:+
Cc: kimquirk, epastorino, martin.langhoff
Action Needed: finalize
Verified: no
Deployments affected: Uruguay
Blocked By:
Blocking:

Description

/sbin/ifconfig isn't in the initrd anymore, so the check_call()s to it fail and we give up on wireless.
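
For illustration only, here is a minimal sketch of that failure mode. It assumes the activation code shells out through subprocess.check_call(); the helper name bring_up and its tool parameter are hypothetical and not taken from the initramfs source.

```python
import subprocess

def bring_up(iface, tool="/sbin/ifconfig"):
    """Try to bring an interface up; return True on success.

    Hypothetical helper, only meant to show how a missing binary surfaces.
    """
    try:
        subprocess.check_call([tool, iface, "up"])
        return True
    except OSError:
        # The executable itself could not be run, e.g. it is not in the initrd.
        return False
    except subprocess.CalledProcessError:
        # The tool ran but exited with a non-zero status.
        return False
```

With the binary absent, check_call() raises before any network configuration happens, which would match the "give up on wireless" behaviour described above.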

Change History

Changed 6 years ago by cscott

ifconfig seems to have gone missing in olpcrd 0.41; it was present in 0.40 and previous. olpcrd 0.41 was released 05 Jun 2008.

The 8.1.x series builds, from 703 to 714, all have olpcrd 0.40. So this is an 8.2-specific problem.

Changed 6 years ago by cscott

Root cause seems to be this changelog from busybox:

busybox (1:1.9.1-2) experimental; urgency=low

  * Readd sections gc to dynamic built binaries.
  * debian/config.udeb:
    - Fix remaining problems.
    - Disable ifconfig.
  * util-linux/mount.c:
    - Support relatime. (closes: #460824)

 -- Bastian Blank <waldi@debian.org>  Thu, 13 Mar 2008 14:25:53 +0100

This seems to have made it into debian unstable on 22 Mar 2008 and olpcrd 0.41 was the first olpcrd built after that date (0.40 was built on Jan 30 2008).

It seems that /sbin/ifconfig has been replaced with /bin/ip; ip link set msh0 up should be equivalent to ifconfig msh0 up, and ip addr add ADDRESS dev msh0 should be equivalent to ifconfig msh0 add ADDRESS (use ip -6 for IPv6 addresses).
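
A rough sketch of that mapping as argv lists suitable for check_call(), using iproute2's "ip link set DEV up" form. The function name ip_commands and its parameters are made up for this example, and the result is untested against the initrd's busybox build.

```python
def ip_commands(iface, address=None, ipv6=False):
    """Return /bin/ip argv lists standing in for the old ifconfig calls.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    # Replacement for: ifconfig IFACE up
    cmds = [["/bin/ip", "link", "set", iface, "up"]]
    if address is not None:
        # Replacement for: ifconfig IFACE add ADDRESS
        base = ["/bin/ip"] + (["-6"] if ipv6 else [])
        cmds.append(base + ["addr", "add", address, "dev", iface])
    return cmds
```

Each returned list could then be passed to subprocess.check_call() in place of the old ifconfig invocation.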

Testing is required.

Changed 6 years ago by kimquirk

  • milestone changed from Not Triaged to 8.2.1
  • deployment_affected set to Uruguay

Changed 6 years ago by cscott

Confirming I'm the right assignee for this. Fix should be straightforward, just a matter of tuits.

Changed 6 years ago by mstone-xmlrpc

  • keywords cjbfor9.1.0 added
  • milestone changed from 8.2.1 to 9.1.0

Pushing out to 9.1.0, per edmcnierney's request.

Changed 6 years ago by mstone

  • next_action changed from never set to design
  • milestone changed from 9.1.0 to 8.2.1

Scott,

I've got a udeb for net-tools that I used in UY to confirm my diagnosis which you're welcome to use; alternately, you could build your own or port the code to use /bin/ip.

What's your preference?

Changed 6 years ago by mstone

After discussion with Scott (and with help from Scott, Chris, and Michailis), I ported the activation code to the new /bin/ip API. In the process, I discovered that the most complicated logic in the networking activation code was unnecessary, so I also created and tested a patch which removes it. See my activation branch for details.

Debugging notes:

  'ip addr' is useful for getting all the address information 
       you could possibly want.
  'ping6 -I <iface> -r <addr>' is a helpful debugging tool.
  the mesh protocols implemented by the p14 and p18 Marvell 
       wireless firmwares are not interoperable in my activation 
       testing.

Changed 6 years ago by mstone

|TestCase|

1. generate an activation lease for your test machine

2. prepare a lease server by flashing 767 onto an XO, then

wget http://teach.laptop.org/~mstone/ivan-act-server.tar.gz
tar xvf ivan-act-server.tar.gz
mv lease.sig actserv/lease.sig  # move your test lease into place
cd actserv
./start-activation-server.sh

3. prepare a devkey for your test machine and remove the test machine's 'ak' flag, if necessary

4. prepare a USB stick with the new initramfs, e.g. 'u:\initrd.gz' in OFW-notation

5. boot your machine from the OFW ok prompt by entering the following lines:

" ro root=mtd0 rootfstype=jffs2 console=tty0 fbcon=font:SUN12x22 activate" to boot-file
" n:\boot\vmlinuz" to boot-device
" u:\initrd.gz" to ramdisk
boot

The test passes if the test machine successfully acquires its activation lease and boots.

Changed 6 years ago by mstone

  • next_action changed from design to review

Scott - once you've reviewed these patches, would you mind generating an official initramfs based on the patches for testing? Thanks!

Changed 6 years ago by mstone

  • cc epastorino added

Changed 6 years ago by martin.langhoff

  • cc martin.langhoff added

Changed 6 years ago by edmcnierney

  • keywords 8.2.1:+ added

Changed 6 years ago by cscott

  • next_action changed from review to diagnose

mstone: I've integrated your patches, but by the report in your commit log, you still haven't fixed IPv4 activation (you just removed it, instead). Working on fixing that still.

Changed 6 years ago by cjb

Scott: I think Michael's under the impression that no-one's using (or would have good reason to use) IPv4 activation.

Changed 6 years ago by cscott

  • next_action changed from diagnose to test in build

Yes, but I haven't heard any evidence for that assertion. What *I* keep hearing from Martin L is all the reasons why IPv6 can't possibly work in the field: incompatible access points, etc, etc. It shouldn't be hard to actually fix IPv4; it seems like the mstone patch is just missing an appropriate route configuration.

I've built olpcrd-0.49 with Michael's patch, my blind guess at an IPv4 fix, and a terribly minor activation GUI tweak (fedora branding), and uploaded it to staging. Please test.

Changed 6 years ago by mstone

The evidence is simply that I ran it through pdb and discovered that the connect() call with ipv4 addresses uniformly failed. Try it yourself; perhaps you'll get a different result. It's entirely possible that my test methodology is broken.
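
For anyone who wants to repeat the check without pdb, here is a small probe that attempts connect() with one address family at a time. It is a sketch that assumes a TCP lease server; the function name try_connect is invented for this example, and the host and port are whatever your own test setup uses.

```python
import socket

def try_connect(host, port, family, timeout=2.0):
    """Attempt a TCP connect() restricted to one address family.

    Returns None on success, or a string describing the last error.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror as e:
        return "getaddrinfo failed: %s" % e
    last_err = "no addresses returned"
    for af, socktype, proto, _canon, sockaddr in infos:
        s = socket.socket(af, socktype, proto)
        s.settimeout(timeout)
        try:
            s.connect(sockaddr)
            return None
        except OSError as e:
            last_err = str(e)
        finally:
            s.close()
    return last_err
```

Calling it once with socket.AF_INET and once with socket.AF_INET6 against the same server shows whether only one family fails.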

Anyway, "incompatible access points" is a completely bogus objection for _this specific bug_ because this bug is about pulling leases over the mesh, which certainly supports IPv6, not over APs. (Moreover, aren't access points pure layer 2 devices? If so, wouldn't they see only ethernet frames? Perhaps Martin meant routers with APs attached?)

Finally, while I agree that it should be easy to fix ipv4, it's even easier to _remove dead code_. Do we have a customer who demands IPv4 support?

Changed 6 years ago by martin.langhoff

At this moment we _know_ that xs-0.5 and xs-0.4 serve leases correctly over IPv4. I don't think they serve leases correctly over IPv6, and that has definitely not been tested.

So the _shipping functionality_ of the XO+XS combo is with IPv4, and it's over IPv4 that we're worried about the regression.

Changed 6 years ago by mstone

This is another strawman: this fix is for UY and UY doesn't use the OLPC XS; they use Ivan's activation server (cited above in my testcase), which is what I tested with.

Changed 6 years ago by mstone

(It's also a strawman because Scott already wrote patches which claim to fix IPv4 routing, even though he didn't tell anyone about them or ask for review.)

Please test!

Changed 6 years ago by martin.langhoff

Happy to test. Where do I grab a signed image to test this? Or... is there a way to test without a signed image somehow? My "8.2" test round failed because I didn't realise I needed a signed image, so I tested the "other" installed OS which was signed.

In return, I promise to *also* test and debug IPv6 activation so as to remove kinks on the XS side.

Changed 6 years ago by martin.langhoff

(I had previously attempted to post this -- re-posting for posterity as it's outdated now.) Michael, if you are going to cut the definition of the bug so tightly, we have to open other regression bugs, and we can argue for a long time over what probably amounts to a single-line fix.

Ironically, Ivan's code is not something we formally support either.

If it can be fixed with reasonable effort and with no added risk, the regression should be fixed so IPv4 and IPv6 both work. Only if supporting IPv4 is a major problem does it make sense to take the tighter definition of the bug Michael is proposing.

Changed 6 years ago by mstone

Step 5 of my test-case above explains how I run the activation code on an unlocked machine. (You might also have to edit one or two variables inside activate.py in the initramfs in order to get it to work.) Alternately, for an even better test, get a devkey for a machine, flash it, remove its AK flag, lock it, and test there.

I believe that the 'staging-7' build contains all of Scott's patches.

Changed 6 years ago by martin.langhoff

Ah, thanks for the pointer. I can't find the initrd we are testing... olpcrd and olpcrd-skel want specific paths (is ~/Projects something CScott has?), and Debian utilities that I have, but in versions that are definitely not the same as the ones CScott is using in his build.

In short, I'm trying to get my hands on the exact initrd you guys are testing, rather than build my own with a different toolchain. CScott seems to be building this on an unstable/lenny box... is that frozen? (IOW, is it a reproducible build?)

For the time being -- downloading the tar.gz of the XO build to see if I can find it there. Pointers welcome if there's an easier way.

Changed 6 years ago by mstone

I was saying that I believe that builds including staging-7 and later contain an initrd worth testing.

See http://wiki.laptop.org/go/Building_initramfsen for instructions on how to remake that initrd.

And no, the initrd is not built in a controlled environment. (Hence this bug.)

Changed 6 years ago by martin.langhoff

Confirmed - works over IPv4 with Active Antennas against 0.5.x servers. Tested using MStone's procedure, with the olpcrd.img file extracted from the staging-7 image published earlier.

Will need to be tested with a final signed image as well once we have one.

Changed 5 years ago by dsd

  • next_action changed from test in build to finalize

Confirmed working using ivan-act-server on an XO. Activated 10 secured laptops that were running the signed version of staging-21 (freshly installed via NANDblaster). Worked perfectly. Looking forward to using this combination in the field!

Changed 5 years ago by martin.langhoff

  • status changed from new to closed
  • resolution set to fixed

This has been in 8.2.1 for a while now :-)
