Opened 7 years ago

Closed 7 years ago

#4805 closed defect (invalid)

OLPC laptops interfering with other 802.11b/g traffic

Reported by: kimquirk Owned by: mbletsas
Priority: high Milestone: Update.1
Component: wireless Version:
Keywords: Cc: carrano, yani, javier, rchokshi, kimquirk, lac@…, jg, dilinger, satch@…
Blocked By: Blocking:
Deployments affected: Action Needed:
Verified: no

Description (last modified by kimquirk)

Email from a conference attendee (Hackers, nov 2007):

I am at a conference.
there are many OLPCs here. They are interfering with the wireless
here. They apparently do this even when not officially trying to
be connected to the network,
just being powered on is enough.

When we get a report from the people who brought our own network
and are running it about what the heck is wrong with the fool things
we can decide whether to ban them from PyCON or if we have a
technical fix for them which we can require OLPC users to run.
or have a technical fix for our network.

Otherwise, 10 of the things eats a network which was designed to
comfortably hold several thousand simultaneous users, or 200
people like us who use streaming video a lot.

Attachments (1)

beacons.png (69.7 KB) - added by carrano 7 years ago.
graph instructions: black = total traffic; green = all beacons; blue = beacons coming from XOs


Change History (29)

comment:1 Changed 7 years ago by jg

  • Cc javier rchokshi kimquirk added
  • Milestone changed from Never Assigned to Future Release

Kim, we need more data.... Builds, symptoms seen on the network, access points at the convention. Otherwise we're not likely to be able to follow up on this, which would be a shame.

comment:2 Changed 7 years ago by rchokshi

Does this have to do with the fact that every XO transmits a beacon every 100 ms at a 1 Mbps data rate? If so, with the latest firmware drop and the latest driver, you can disable beacon transmission completely or at least reduce the frequency of the transmissions.
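
For scale, the airtime such beacons occupy can be estimated with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: the 100-byte beacon size is an assumed figure, not something measured in this ticket, and the 192 us term is the long DSSS preamble plus PLCP header sent at 1 Mbps. It ignores contention and any case where a station skips a beacon.

```shell
#!/bin/sh
# Rough airtime estimate for periodic beacons (illustrative numbers only).
# Assumptions, not taken from the ticket: a 100-byte beacon frame and the
# 192 us long DSSS preamble plus PLCP header used at 1 Mbps.

# beacon_airtime_us SIZE_BYTES RATE_MBPS -> microseconds on air per beacon
beacon_airtime_us() {
    size_bytes=$1
    rate_mbps=$2
    preamble_us=192
    # bits divided by Mbit/s gives microseconds
    echo $(( preamble_us + size_bytes * 8 / rate_mbps ))
}

# beacon_load_permille SIZE_BYTES RATE_MBPS INTERVAL_MS -> airtime share, in 1/1000
beacon_load_permille() {
    per_beacon_us=$(beacon_airtime_us "$1" "$2")
    beacons_per_sec=$(( 1000 / $3 ))
    # microseconds used per second, divided by 1000, gives parts per thousand
    echo $(( per_beacon_us * beacons_per_sec / 1000 ))
}

echo "one beacon: $(beacon_airtime_us 100 1) us"
echo "one sender: $(beacon_load_permille 100 1 100) permille of airtime"
```

With these assumed numbers a single sender uses roughly 1% of the airtime, so the total cost depends on how many devices actually send a beacon in each interval.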

comment:3 Changed 7 years ago by jg

See also report at #4818

comment:4 Changed 7 years ago by wmb@…

Just to clarify - the problem was observed at the Hacker's Convention at Chaminade conference center near Santa Cruz, CA, not at PyCon. The reason that PyCon came up is because the reporter is at Hackers and is thinking ahead about what might happen at PyCon in March.

The reporter is Laura Creighton - lac@… . It is unclear that she is the best person to ask the specific questions above. From the tone of her message and another similar message to the hackers mailing list, I got the impression that she has experienced the effects of the XOs but is not herself responsible for either the XOs or the wireless infrastructure at the conference.

comment:5 Changed 7 years ago by jg

  • Cc lac@… added
  • Milestone changed from Future Release to Update.1
  • Priority changed from high to blocker

Without information beyond "They are interfering with the wireless
here", we're going to find this really tough.

https://dev.laptop.org/ticket/4805

What are/were the symptoms? Who can we talk to about them? What model(s)/version of wireless infrastructure does the conference have?

Any help you can give will be greatly appreciated, if only pointers to people we can find out this information from...

comment:6 Changed 7 years ago by jg

We think we might be seeing this elsewhere; it may be as simple as the mesh beacons that are still enabled at a high rate (about to be disabled), or it may be something else. We'll keep this trac entry updated as we gather information/make changes.

comment:7 Changed 7 years ago by wmb@…

From Laura Creighton: [ Contact Mitch if you need the email address alluded to below ]

wmb is correct about where I was, and that I am looking ahead for a
conference where I am an organiser. I was just a user of the network
in Santa Cruz. But I talked to the person who did run the network
last night. His name is Steve Satchell, and he is cc'd on this mail.

Satch, can you please repeat what you know about this problem
to this email so that the network volunteers of pycon-organizers,
who are on this list, can find out what went wrong, and so
that the OLPC people who are reading this ticket can find
out what went wrong too? If this ticket is a dup of a third
ticket, where you have already explained the details, can
you just point us at that ticket? I know about
http://dev.laptop.org/ticket/4818 already.

comment:8 follow-ups: Changed 7 years ago by wmb@…

Another Hacker's attendee - Dan Miller - adds: [ BTW, THINK is the alternative/sanitized name for the Hacker's Conference; Chaminade is the name of the conference venue ]

dunno if this is relevant, but -- while the THINK ssid was close to useless
(random disconnections within 30 seconds in the rare event that you could
get a connection at all), the Chaminade wifi worked flawlessly for me, once
I paid the $10 tithe. I needed a good connection to do my OpenSim demo.

Changed 7 years ago by carrano

graph instructions: black = total traffic; green = all beacons; blue = beacons coming from XOs

comment:9 Changed 7 years ago by carrano

Do we know what Access Point models were used in the PyCON conference? We should investigate WDS related issues.

On beacons:
Please take a look at the attached graph.

  • The black line accounts for the total traffic in bytes per second as captured in 1cc (Kim's office)
  • The green line is the traffic generated by beacons alone.
  • The blue line is traffic generated by beacons *coming from XOs*

The graph and the trace files indicate that the major source of beacons is the Access Points.

The trace file also indicates that the rate of captured beacons was below 20 beacons per second.

So why do we not have a flood of beacons? (There are about 15 XOs in Kim's office alone.)

Mihalis Bletsas just explained that the XOs back off when they hear beacons.

comment:10 Changed 7 years ago by jg

  • Cc jg dilinger added

comment:11 in reply to: ↑ 8 ; follow-up: Changed 7 years ago by mbletsas

This completely rules out the beacon scenario (which I always found hard to believe anyway) and leaves a lazy-WDS interaction as the only option here (if it was the XOs that were causing the trouble in the first place - a very big IF)

M.

Replying to wmb@firmworks.com:

Another Hacker's attendee - Dan Miller - adds: [ BTW, THINK is the alternative/sanitized name for the Hacker's Conference; Chaminade is the name of the conference venue ]

dunno if this is relevant, but -- while the THINK ssid was close to useless
(random disconnections within 30 seconds in the rare event that you could
get a connection at all), the Chaminade wifi worked flawlessly for me, once
I paid the $10 tithe. I needed a good connection to do my OpenSim demo.

comment:12 in reply to: ↑ 8 Changed 7 years ago by jcardona

Replying to wmb@firmworks.com:

Another Hacker's attendee - Dan Miller - adds:
(...) the Chaminade wifi worked flawlessly for me, once
I paid the $10 tithe. I needed a good connection to do my OpenSim demo.

Yet another indication that this is not a "medium saturation" but a problem on the AP side.
We must know what equipment they were using, as we already have a long list of Lazy-WDS related problems that would explain those symptoms.

In parallel to that, as a preemptive move, someone from OLPC should ask Linksys (Cisco)/Broadcom(*) for more info about how their APs/wireless cards respond to WDS frames. The XOs use standard WDS headers (i.e. conforming with IEEE Std. 802.11-2007, Sect. 7.2.2). If that is causing denial of service or association losses, they must clearly explain why so we can find a workaround.

(*) According to this entry in Wikipedia, the WRT54G uses Broadcom wireless modules.
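
For readers unfamiliar with the WDS header mentioned above: in 802.11 a frame uses the four-address (WDS) format when both the "To DS" and "From DS" bits of the Frame Control field are set. The sketch below shows only that bit test; the function name is made up for illustration, and it says nothing about how any particular AP reacts to such frames.

```shell
#!/bin/sh
# Check whether an 802.11 frame uses the four-address (WDS) format.
# Input: the second byte of the Frame Control field (the flags byte),
# where bit 0 is "To DS" and bit 1 is "From DS".

# is_wds_flags FLAGS_BYTE -> exit status 0 if both To DS and From DS are set
is_wds_flags() {
    [ $(( $1 & 0x03 )) -eq 3 ]
}

for flags in 0x00 0x01 0x02 0x03; do
    if is_wds_flags "$flags"; then
        echo "$flags: four-address (WDS) frame"
    else
        echo "$flags: not a WDS frame"
    fi
done
```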

comment:13 follow-up: Changed 7 years ago by satch89450

Hello, my name is Stephen Satchell. I'm the network guy at the THINK conference (previously known as the Hacker's Conference as organized by a group of people on The WELL). My responsibility is to oversee our conference demonstration rooms and to oversee the Internet connection we use at the conference. I believe the people who brought the little green laptops are already providing information, so this may be redundant. However, as we say here in Reno, Nevada, a triple check beats a double cross.

Background: the conference site has a network of four access points within the building in question. I don't have a lot of details about those access points, because another gentleman is the Wireless Czar for the conference...but I do know that the conference site's access points are set up to permit handoff fairly well -- if you want to pay the price to the hotel. The THINK network -- the official one -- also has four access points. Three of them have the same SSID, while one of them has a different one. The channel assignments were selected to minimize conflict with the Chaminade network. I counted two rogue access points (set up by conferees outside the "official network"). Then there was a third rogue network with the name "mesh" in it...

I can relate my own experience with the problem, because my personal experience is what led us (me, and the OLPC laptop owners) initially to suspect that there was an interference problem. I have a Sony VAIO running Windows XP (stop booing and hissing, I'm fixing that problem) that has built-in 802.11b wireless. I used this laptop on the NOC table to debug things on my network, both wired and wireless. Also on the NOC table was one of the official access points, with the THINK ssid. I could not connect to it. The OLPC units were across the hallway in the other demo room -- air distance of about 12 feet.

When all of the OLPC units were shut off, I was able to get an "excellent" connection (as reported by the VAIO wireless software) with the access point six inches away. With the OLPC units open and turned on, I couldn't even access the wireless network.

People in the main conference room didn't have problems with OLPC interference (with SSID "hackers-m") until the units were brought in and passed around the room.

I understand from one of the two owners that two of the units did *not* have the most up-to-date software on them, so this could have been a problem that has already been fixed. (The irony is that this gentleman was trying to download the latest version, but his laptop -- an Apple one, if memory serves -- was unable to complete the download. Catch-22!) Also, when the laptops had their networking disabled as claimed by the owner, the problem appeared to go away as well.

One side issue: the "network off" mode wasn't sticky. When the units were reportedly turned off and then back on, the network would come back up and not stay in the disabled condition.

The workaround I did was to go into my heap of stuff and yank out a couple of 24-port Ethernet switches with 100BASE-T ports, expand the wired network at the laptop tables, and extend wired Ethernet to a lounge area where laptop users like to congregate. This reduced the load on the wireless network.

I don't recall if the software was updated during the conference, or tested. As it was, I got five hours of sleep during 72 hours, so I wasn't as sharp as I would have liked to be. Also the OLPCs were not mine. (Should I fix that last? Hmmm...)

If you have questions, you can send them to satch@… and I will do my best to answer them. I would prefer you use my work address, as *everything* I had at the conference is currently at the office, including my contact lists.

comment:14 follow-up: Changed 7 years ago by jg

I'm trying to track down a contact at Linksys... I have one possible indirect contact. If you have a suggestion as to whom, that would be good.

Also, do we have a good description of the problem as observed written up, describing exactly what happens when we send a frame?

And satch, I think we can send you an XO for your trouble...
http://wiki.laptop.org/go/Developers_program

comment:15 in reply to: ↑ 13 Changed 7 years ago by jcardona

  • Cc satch@… added

Hi Stephen,

Thank you for the additional information.

Replying to satch89450:

The THINK network -- the official one -- also has four access points. Three of them have the same SSID, while one of them has a different one. The channel assignments were selected to minimize conflict with the Chaminade network.

Do you have a way to find out what's the brand, model and version of those APs? That would be very useful in resolving this problem.

comment:16 Changed 7 years ago by satch89450

I have sent a request to my Wireless Czar for this information. Can someone add satch@… to the cc: list on this bug, I'm so brain-dead right now...

comment:17 in reply to: ↑ 11 ; follow-up: Changed 7 years ago by AlbertCahalan

Note that the beacons.png graph is showing bytes, not transmit time.

If the mesh is at 1 megabit/second while regular laptops are at 11 or 54, then you really need to multiply the mesh traffic by 11 or 54 to account for that.

Low-speed traffic can be even worse. Under some conditions (forgot the details, sorry) it can make other stuff go at the slow rate as well.

comment:18 in reply to: ↑ 17 Changed 7 years ago by carrano

Replying to AlbertCahalan:

Note that the beacons.png graph is showing bytes, not transmit time.

If the mesh is at 1 megabit/second while regular laptops are at 11 or 54, then you really need to multiply the mesh traffic by 11 or 54 to account for that.

Low-speed traffic can be even worse. Under some conditions (forgot the details, sorry) it can make other stuff go at the slow rate as well.

Albert,
Yes, this is not an airtime graph. But that's why I added the green line - an indicator that the beacons generated by other sources are more demanding, even though the XOs outnumber them by far. Also note that these other beacons are also transmitted at 1Mbps. So we have more traffic generated by the beacons of the APs than by those of the XOs (and we probably have 50 XOs here).
My point is: we cannot simply multiply the traffic generated by the beacons of one XO by the number of XOs - because they back off when they hear another XO's beacons.
And, more importantly, the data frames are not being transmitted at 1Mbps, only the beacons (and this is exactly what other 802.11 nodes do).
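
The bytes-versus-airtime point in this exchange can be made concrete with a small conversion. This is a minimal sketch under simplifying assumptions: it counts payload bits only and ignores preambles, ACKs, inter-frame gaps and backoff, and the 10000 bytes/s figure is a hypothetical reading, not a value taken from beacons.png.

```shell
#!/bin/sh
# Convert a byte rate read off a traffic graph into an approximate share of
# airtime. Payload bits only: preambles, ACKs, inter-frame gaps and backoff
# are ignored, so real airtime is higher than this estimate.

# airtime_permille BYTES_PER_SEC RATE_MBPS -> share of each second, in 1/1000
airtime_permille() {
    bytes_per_sec=$1
    rate_mbps=$2
    # (bits per second) / (rate in bit/s) * 1000, simplified for integer math
    echo $(( bytes_per_sec * 8 / (rate_mbps * 1000) ))
}

# Hypothetical reading: 10000 bytes/s of traffic, sent at three different rates.
for rate in 1 11 54; do
    echo "10000 B/s at ${rate} Mbps: $(airtime_permille 10000 "$rate") permille"
done
```

The same byte count occupies about 8% of the air at 1 Mbps but well under 1% at 54 Mbps, which is why a bytes-per-second graph understates low-rate traffic.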

comment:19 in reply to: ↑ 8 Changed 7 years ago by MarkHarrison

Replying to wmb@firmworks.com:

dunno if this is relevant, but -- while the THINK ssid was close to useless
(random disconnections within 30 seconds in the rare event that you could
get a connection at all), the Chaminade wifi worked flawlessly for me, once
I paid the $10 tithe. I needed a good connection to do my OpenSim demo.

I can confirm that the hotel paid wifi also worked for me during the time
in question.

comment:20 Changed 7 years ago by kimquirk

  • Description modified (diff)

comment:21 in reply to: ↑ 14 ; follow-up: Changed 7 years ago by jcardona

Replying to jg:

Also, do we have a good description of the problem as observed written up, describing exactly what happens when we send a frame?

We don't have any description of 4805 other than what's on this ticket. Monitor mode was implemented exactly for the purpose of remotely analyzing these sorts of scenarios. We now have to advertise it and make it easy for early users to capture traffic and attach it to tickets when they experience any problem with the network. One way to do that would be to map the script below to an easy-to-remember key combination (Magnifying Lens + Mesh View?).

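#!/bin/bash
# Capture $SECS seconds of 802.11 traffic from the libertas monitor interface.
SECS=30
CAPTURE_FILE=$(date +%y%m%d%H%M).cap
TRAFFIC_MASK=0x7

echo "Capturing traffic for $SECS seconds..."

# Write the traffic mask to libertas_rtap (turns on monitor mode), then bring up rtap0.
echo $TRAFFIC_MASK > /sys/class/net/msh0/device/libertas_rtap
ifconfig rtap0 up
# Run tcpdump in the background and remember its PID so it can be stopped.
tcpdump -s 128 -i rtap0 -w "$CAPTURE_FILE" &> /dev/null &
PID=$!
sleep $SECS
kill $PID
echo "Done.  Capture file is $CAPTURE_FILE"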

comment:22 in reply to: ↑ 21 Changed 7 years ago by jcardona

Network Manager needs to be off while capturing. Corrected:

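#!/bin/bash
# Capture $SECS seconds of 802.11 traffic with NetworkManager stopped.
SECS=30
CAPTURE_FILE=$(date +%y%m%d%H%M).cap
TRAFFIC_MASK=0x7

echo "Capturing traffic for $SECS seconds..."
# NetworkManager must not be running while capturing.
killall NetworkManager
sleep 1
# Write the traffic mask to libertas_rtap (turns on monitor mode), then bring up rtap0.
echo $TRAFFIC_MASK > /sys/class/net/msh0/device/libertas_rtap
ifconfig rtap0 up
# Run tcpdump in the background and remember its PID so it can be stopped.
tcpdump -s 128 -i rtap0 -w "$CAPTURE_FILE" &> /dev/null &
PID=$!
sleep $SECS
kill $PID
echo "Done.  Capture file is $CAPTURE_FILE"
# Restart NetworkManager once the capture is finished.
NetworkManager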
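
Once a capture file exists, a first pass over it could count the beacons and derive an average rate. The helper names below are hypothetical, and the sketch assumes tcpdump is installed and that the file was recorded on a monitor (802.11) interface so that the `type mgt subtype beacon` filter applies.

```shell
#!/bin/sh
# Summarize a capture file: how many beacons it holds and the average rate.
# Assumes tcpdump is installed and that the file was recorded on a monitor
# (802.11) interface, so the "type mgt subtype beacon" filter applies.

# count_beacons FILE -> number of beacon frames in the capture
count_beacons() {
    tcpdump -n -r "$1" 'type mgt subtype beacon' 2>/dev/null | wc -l
}

# beacons_per_sec COUNT SECONDS -> average beacon rate (integer division)
beacons_per_sec() {
    echo $(( $1 / $2 ))
}

# Example (hypothetical file name, 30-second capture):
#   n=$(count_beacons 0711151030.cap)
#   echo "$(beacons_per_sec "$n" 30) beacons/s"
```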

comment:23 Changed 7 years ago by rsavoye

Comments from the Hacker's Conference list that I posted this morning. Sorry, I've been on vacation, driving home 1300 miles from the conference. There were 2 B4s and 1 B2 at the conference involved in this problem. I see satch gave y'all most of the relevant info, but I'll add my comments anyway.

can somebody who knows post here an explanation on why the
OLPCs were rendering the wifi network unusable, and if the

Several folks, including myself, spent time trying to track this down. I'll
add full details to the bug report on dev.laptop.org after I get home from
Hackers. (I've been driving back to CO) We believe the problem is related to
having multiple APs with the same SSID, all in a small area. (9 of them within
200 feet). I'll try to test this when I get back.

I've had OLPCs in other dense networks, like Chaos Camp, with zero problems,
but there all the APs had unique SSIDs. There were also multiple OLPCs. As
they were all running build 406, where the mesh code doesn't even fully work, it's
odd there was a problem at all.

rumour that they do it even when not connected to the network,
just by being powered on is true?

The problem was people kept rebooting the OLPCs, as they often didn't
know how to close running applications, which restarted the mesh network
after I had shut it down.

comment:24 follow-up: Changed 7 years ago by carrano

Hi!

One crucial piece of information is (still) missing here.
What is the brand/model of the Access Point?

There is a good chance that we are dealing with a fairly well-known issue.

comment:25 Changed 7 years ago by cscott

  • Milestone changed from Update.1 to Ship.2

comment:26 in reply to: ↑ 24 Changed 7 years ago by satch89450

Replying to carrano:

Hi!

One crucial piece of information is (still) missing here.
What is the brand/model of the Access Point?

There is a good chance that we are dealing with a fairly well-known issue.

Chaminade responded yesterday:

To answer your question about our APs, we have two models, Symbol
Spectrum24 which is an enterprise grade 802.11b AP. We also use the SMC
SMC2552W-G2 enterprise grade 802.11b/g AP.

I've not heard back from our Wireless Czar, but believe that some of the APs we deployed were Cisco.

comment:27 Changed 7 years ago by kimquirk

  • Milestone changed from Ship.2 to Update.1
  • Priority changed from blocker to high

Moving this to update1 for more testing; but we believe the WDS fix in ship2 should solve this problem.

comment:28 Changed 7 years ago by mbletsas

  • Resolution set to invalid
  • Status changed from new to closed

I am closing this one. The only way that XOs will interfere with other 802.11b/g traffic is if they do not follow 802.11b/g, and to this day nobody has provided even an indication of such behavior.

M
