Ticket #5144 (new defect)

Opened 9 months ago

Last modified 7 days ago

[firmware] Marvell firmware should have controls for turning mesh forwarding on and off

Reported by: jg
Owned by: jcardona
Priority: high
Milestone: 8.2.0 (was Update.2)
Component: wireless
Version:
Keywords: blocks-:8.2.0 joyride-2346:+
Cc: cjb, marco, jcardona, mbletsas, gnu, mtd, ashish, gregorio, carrano
Action Needed: qa signoff
Verified: no
Blocked By:
Blocking: #7879

Description

#5143 documents the need for this change.

Ronak and Javier are working on this.

Attachments

usb8388_5_110_21_p1.tar.gz (82.4 kB) - added by mbletsas 9 months ago.
Firmware for Marvell 8388 WiFi
0002-MESH-START-STOP-IOCTLS.patch (19.8 kB) - added by mbletsas 9 months ago.
required IOCTLs for the new mesh start/stop functionality (as of 5.110.21.p1)
0001-Added-event-check-for-firmware-download.patch (2.1 kB) - added by mbletsas 9 months ago.
added event for completion of firmware download (ticket #4637)
disable_mesh_auto_start.patch (5.3 kB) - added by mbletsas 9 months ago.
self explanatory
0001_mesh_start_stop_firmware_ready_disabled_autostart.patch (25.1 kB) - added by mbletsas 9 months ago.
combined patch for all the previous files.

Change History

Changed 9 months ago by mbletsas

Firmware for Marvell 8388 WiFi

  Changed 9 months ago by mbletsas

Marvell's release notes

New Features/Bug fixes


1. Mesh start/stop control from the driver. By default mesh is off.
2. Firmware ready event after firmware is downloaded from host and initialized -- OLPC ticket #4637.
3. Fix for OLPC ticket #4927.

Additional notes for driver etc.


1. Libertas driver patch is required to handle the firmware ready event.
2. Libertas driver patch is required for the mesh start/stop feature.
3. Changes introduced in the firmware do not allow certain services to run unless the service already running on a given channel is stopped by the driver. This may need changes in applications, in particular Network Manager, which must stop the existing service before trying to associate with other BSSs.
4. Firmware suspend-resume behavior. The driver needs to make sure that mesh is running before putting the firmware into suspend state. Failing that, the user needs to make sure mesh is running. The attached patch does not issue mesh start, if mesh is not running, prior to moving into suspend state.
5. All instances of mesh autostart enable/disable have been removed from the driver. This is as per the mesh always on/start/stop design. Any such command will be returned as NOT supported by the firmware. The existing driver, https://dev.laptop.org/git?p=olpc-2.6;a=snapshot;h=72714fb50756a417f0c0fde180e68b9480f034be does not check command return codes properly, and it has been observed that initial firmware initialization fails with the existing driver due to the 'not supported' command response.

  Changed 9 months ago by mbletsas

Mesh Start/Stop Behavior

1. Mesh Start/Stop

This feature isolates mesh network operation from the adhoc/infra operation of the WLAN. With the mesh start/stop command, the STA/XO can connect to the mesh network and communicate over it without connecting to any adhoc/infra network. Mesh operation can also coexist with infra or adhoc operation, provided both are on the same RF channel.

In mesh mode the STA/XO transmits the mesh beacon like an adhoc beacon. The mesh beacon definition is the same as the adhoc beacon except for the SSID and the mesh information element. In pure mesh mode (i.e. no infra or adhoc is on at the same time), the beacon consists of all elements of the adhoc beacon with a NULL SSID and the mesh information element appended to it. The mesh information element is identified by the Marvell vendor-specific ID and mainly consists of the mesh protocol ID, metric ID and mesh ID. The mesh ID is the mesh network name, like an ESSID. When mesh mode coexists with an adhoc network, the beacon transmitted is the adhoc beacon with the mesh information element appended to it.

There are two commands to control the mesh mode and one to display mesh status. All of them are implemented as iwpriv commands. The control commands are mesh_start and mesh_stop. The mesh status command returns the status of the mesh. Mesh start takes two parameters: the channel on which the user intends to start the mesh, and the mesh ID.

Below is the syntax for mesh driver API.

iwpriv msh0 mesh_start "<channel> <mesh id>"

Examples: Mesh Start: To start mesh on channel 6 with mesh id "olpc-mesh".

$> iwpriv msh0 mesh_start "6 olpc-mesh"

The mesh start command without any parameters will start the mesh on channel 1 with “olpc-mesh” as the mesh ID.

$> iwpriv msh0 mesh_start

Mesh Stop: This will stop the mesh

$> iwpriv msh0 mesh_stop

Mesh Status: This will return the status of the mesh. If mesh is on, it will display the channel and the mesh ID.

$> iwpriv msh0 mesh status
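
The mesh start invocations above can be wrapped in a small helper. This is a minimal sketch and not part of the attached patches: the function name mesh_start_cmd is hypothetical, and it only prints the iwpriv command line described in Marvell's notes instead of running it, so it can be tried without the hardware. Passing the channel and mesh ID as one quoted argument follows the syntax shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the patches): print the iwpriv command that
# would start the mesh, following the syntax quoted in the release notes.
# With no arguments it prints the bare form, which per the notes makes the
# firmware use channel 1 and mesh ID "olpc-mesh".
mesh_start_cmd() {
    if [ "$#" -eq 0 ]; then
        echo 'iwpriv msh0 mesh_start'
    elif [ "$#" -eq 2 ]; then
        # Channel and mesh ID travel as one quoted argument.
        printf 'iwpriv msh0 mesh_start "%s %s"\n' "$1" "$2"
    else
        echo 'usage: mesh_start_cmd [<channel> <mesh id>]' >&2
        return 1
    fi
}

mesh_start_cmd 6 olpc-mesh
mesh_start_cmd
```

Printing rather than executing keeps the sketch safe to run on any machine; on an XO the printed line could be reviewed and then run by hand.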

2. RF Channel Setting

The RF channel can be set if and only if the host is not connected to any network. This default behavior also applies to mesh mode operation. When the STA/XO is connected to any network and a channel change command is issued, the STA/XO gets disconnected from the network. If the connected network is an adhoc or mesh network, the STA/XO restarts the adhoc or mesh network on the new channel.

The behavior of the commands to start/join networks has changed. With the mesh start/stop command, the user can start a mesh network alongside an infra or ad-hoc network, or vice versa. Because mesh can coexist with another network (ad-hoc/infra), the STA/XO can join a second network (ad-hoc/infra) only if it is on the same channel as the first network (mesh). If the second network's channel differs from the first's, the join will fail.

When the STA/XO is part of an adhoc/mesh/infra network, the user can start a mesh/ad-hoc network on the same channel as that network. But if the user tries to start mesh/ad-hoc on a different channel, the ad-hoc/mesh network will be started on the old channel, the one on which the first network was started.
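
The channel rules for a second network can be summarized as a small decision function. This is only a model of the behavior described in Marvell's notes, written for illustration: the function name and its arguments are hypothetical, and nothing here talks to the firmware.

```shell
#!/bin/sh
# Hypothetical model of the channel rules described above; it does not
# query the firmware. Arguments:
#   $1  action: "join" (infra/ad-hoc join) or "start" (mesh/ad-hoc start)
#   $2  channel of the network already running, or empty if none
#   $3  channel requested for the second network
# Prints the channel the second network ends up on, or "fail".
second_network_channel() {
    action=$1 active=$2 requested=$3
    if [ -z "$active" ] || [ "$active" = "$requested" ]; then
        echo "$requested"          # nothing running, or same channel
    elif [ "$action" = "join" ]; then
        echo "fail"                # join on a different channel fails
    else
        echo "$active"             # start falls back to the old channel
    fi
}

second_network_channel join 6 6
second_network_channel join 6 11
second_network_channel start 6 11
```

The asymmetry is the point of the sketch: a join on a mismatched channel is refused, while a start on a mismatched channel is carried out on the channel already in use.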

Changed 9 months ago by mbletsas

required IOCTLs for the new mesh start/stop functionality (as of 5.110.21.p1)

Changed 9 months ago by mbletsas

added event for completion of firmware download (ticket #4637)

Changed 9 months ago by mbletsas

self explanatory

Changed 9 months ago by mbletsas

combined patch for all the previous files.

  Changed 9 months ago by jcardona

The patches on this ticket were cleaned up and posted here:

http://lists.laptop.org/pipermail/devel/2007-November/007986.html
http://lists.laptop.org/pipermail/devel/2007-November/007987.html

The only change, other than formatting, is in the handling of the firmware ready event. If that event does not arrive, the driver barks and tries to continue. This is to avoid breaking backward compatibility.

  Changed 9 months ago by kimquirk

  • priority changed from blocker to high
  • milestone changed from Ship.2 to Update.1

Moving to update1; we have a solution for ship2.

  Changed 9 months ago by kimquirk

  • milestone changed from Update.1 to Update.2

Fixes to 4470 and WDS are good enough for update1; moving to update2

  Changed 8 months ago by dwmw2

A saner implementation of mesh on/off control has been added to the new driver. There are still some issues with its interaction with AP mode (#5481), and I think we should still have the _firmware_ default to mesh off. The driver can enable it if it wants to.

  Changed 8 months ago by dwmw2

  • cc dwmw2 added

  Changed 8 months ago by dwmw2

  • summary changed from Marvell firmware should default to mesh off, and have controls for turning mesh on and off to [firmware] Marvell firmware should default to mesh off, and have controls for turning mesh on and off

  Changed 7 months ago by gnu

  • cc cjb, marco added

Is anyone working on the user interface for this?

It should clearly be possible for end users (e.g. G1G1) to easily turn off the mesh and leave it off permanently. This would not only avoid scrambling the brains of nearby buggy access points and/or other WiFi devices. It would also allow the entire wireless chip to be powered down during lid-close or power-button suspends (the sort that are not interrupted by incoming packets). This would probably *double* the battery life when the laptop is closed.

Having the mesh off would also significantly reduce the amount of transmitting that the laptop does, which would somewhat lengthen battery life during operation. This is good both for schools that use ordinary access points, and for G1G1 users.

Marvell doesn't support "powersave mode" when the mesh is on. I think this 802.11 mode coordinates between the access point and the wifi chip so that transmissions to the wifi chip occur in a pre-negotiated time slot when the wifi chip has committed to be listening; the rest of the time the wifi chip burns less power. See http://dev.laptop.org/ticket/5418#comment:7 .

Once we are using this control to disable the mesh, we can also measure the Libertas power consumption under more controlled circumstances, and seek firmware improvements that reduce its power consumption. Excessive transmission was one of the suspected causes for the chip's higher-than-expected power consumption, discovered in the Tinderbox.

  Changed 5 months ago by ashish

Mesh start/stop has been fixed in firmware version 5.110.22.p9. There was a problem in the mesh stop logic in earlier versions of the firmware; see http://dev.laptop.org/ticket/6589

  Changed 4 weeks ago by gnu

  • cc jcardona, mbletsas, gnu added
  • keywords blocks?:8.2.0 added; jcardona, mbletsas removed
  • next_action set to never set

Joyride 2263 apparently does not include these fixes from 4 months ago. "iwpriv msh0 mesh status" reports: "invalid command: mesh".

As stated above, having mesh off by default improves battery life all by itself -- and enables minor changes elsewhere to radically improve battery life. Since most deployments are not using mesh, and many have power problems, we should get these power improvements in.

  Changed 4 weeks ago by gnu

  • blocking 7879 added

  Changed 4 weeks ago by mtd

  • cc mtd added

  Changed 3 weeks ago by gregorio

  • cc ashish, gregorio added

Hi Michael and Michailis,

Where is this code? We want the ability to turn it off (GUI for that is another story :-).

Can we get this in?

Longer battery life is one of the key features of the release and it sounds like this will make a big difference in that.

Thanks,

Greg S

  Changed 3 weeks ago by cscott

Copying from email from mbletsas:

Given that such a modification requires extensive driver patching, I don't think we should pursue it for this release.

M.

and dsaxena (responding to michalis):

+1

I must be missing something, b/c shouldn't the mesh just be disabled
until we do "ifconfig msh0 up" and then disabled via "ifconfig msh0 down"?
I'm not sure why we need a bunch of iwpriv calls for this.

~Deepak


follow-up: ↓ 17   Changed 3 weeks ago by dsaxena

Specifically:

iwpriv msh0 mesh_start "<channel> <mesh id>"

ifconfig msh0 up
iwconfig msh0 essid <id>
iwconfig msh0 channel <channel>

Mesh start command without any parameter will start the mesh on channel 1 with “olpc-mesh” as a mesh ID

$> iwpriv msh0 mesh_start

ifconfig msh0 up

Mesh Stop: This will stop the mesh

$> iwpriv msh0 mesh_stop

ifconfig msh0 down

Mesh Status: This will return the status of the mesh. If mesh is on then it will display the channel and the mesh ID.

$> iwpriv msh0 mesh status

iwconfig msh0
ifconfig msh0

What other data/control do we need that is not provided by the standard interfaces?

in reply to: ↑ 16 ; follow-up: ↓ 19   Changed 3 weeks ago by mbletsas

Deepak,

By design, the mesh forwarding engine always runs, regardless of the status of the kernel networking device. So when you issue an "ifconfig msh0 down" command, all that you are doing is telling the host to stop using the mesh interface and not to turn off the mesh forwarding engine in the firmware.

In general, this seems to confuse the hell out of linux people conceptually, since they believe that ifconfig should really turn things on and off. In our case msh0 is only the means by which the host communicates with the communications processor on the wireless subsystem and the linux kernel interface semantics are just not enough for the host processor to exercise control over its partner.

M.

  Changed 3 weeks ago by dwmw2

I agree that it makes sense to leave the mesh running when the interface is logically down. I don't find that confusing at all.

When the mesh is disabled completely by echo 0 > /sys/class/net/eth0/lbs_mesh, however, we ought to actually disable it as requested. It's fairly trivial to do that in the driver -- just add the necessary firmware command in the existing lbs_remove_mesh() function. And the corresponding command to re-enable it in lbs_add_mesh(), of course.

in reply to: ↑ 17   Changed 3 weeks ago by dsaxena

Replying to mbletsas:

Deepak, By design, the mesh forwarding engine always runs, regardless of the status of the kernel networking device. So when you issue an "ifconfig msh0 down" command, all that you are doing is telling the host to stop using the mesh interface and not to turn off the mesh forwarding engine in the firmware. In general, this seems to confuse the hell out of linux people conceptually, since they believe that ifconfig should really turn things on and off. In our case msh0 is only the means by which the host communicates with the communications processor on the wireless subsystem and the linux kernel interface semantics are just not enough for the host processor to exercise control over its partner.

Thanks for the clarification. Makes much more sense now.

  Changed 3 weeks ago by cjb

  • cc carrano added

Could someone (Ricardo, Javier, or Deepak?) summarize the situation here? This one's listed as a proposed blocker.

  Changed 3 weeks ago by mbletsas

  • type changed from defect to enhancement

It is definitely not a blocker, and it shouldn't be characterized as a bug either, given that the wireless interface behaves the way we intended it to. Given that we don't have any easy (GUI) way to control wireless parameters yet, this should be classified as a future enhancement with a dependency on the GUI control panel.

  Changed 3 weeks ago by gnu

I agree it doesn't block 8.2.0.

We have 9-month-old firmware and driver that we haven't put into a release, and we have an Eben design for a GUI for turning mesh on/off that we didn't implement either. The GUI was not implemented because the driver and firmware couldn't do it. Now the driver and firmware are not being implemented because we don't have a GUI?

I love this project. Especially at release-crunch time.

These firmware improvements may already be included in 8.2.0 (due to riding along with other firmware stuff we needed). joyride-2301 ships with firmware 5.110.22.p17, which is much newer than the 5.110.21.X offered above. Thus I think we have the firmware improvements already, though it'd be nice if someone who actually maintains this would say so.

7 months ago, supposedly the driver improvements were also made. ethtool -i shows that joyride-2301 has driver version COMM-USB8388-323-p0-dbg. Does that mean we have this feature in the driver, or not?

I think what we're lacking to close this out is the iwpriv command (and documentation on the ioctls so that the Frame, ohm, and/or Network Manager can invoke the new capabilities). In other words, we are not only 90% done, we are 98% done. I think.

  Changed 3 weeks ago by mstone

  • keywords blocks-:8.2.0 added; blocks?:8.2.0 removed
  • next_action changed from never set to code

  Changed 9 days ago by gregorio

  • owner changed from rchokshi to jcardona
  • type changed from enhancement to defect

  Changed 9 days ago by mbletsas

  • summary changed from [firmware] Marvell firmware should default to mesh off, and have controls for turning mesh on and off to [firmware] Marvell firmware should have controls for turning mesh forwarding on and off

follow-ups: ↓ 27 ↓ 28   Changed 9 days ago by mbletsas

  • cc dwmw2 removed

I have changed the title of this to reflect the original requirement for explicit control of mesh forwarding. I don't agree with the requirement that mesh forwarding should default to off.

in reply to: ↑ 26 ; follow-up: ↓ 29   Changed 8 days ago by jcardona

  • next_action changed from code to communicate

Replying to mbletsas:

I have changed the title of this to reflect the original requirement for explicit control of mesh forwarding. I don't agree with the requirement that mesh forwarding should default to off.

There is currently a way to turn off the mesh via sysfs:

bash-3.2# echo 0 > /sys/class/net/eth1/lbs_mesh 
bash-3.2# ifconfig msh0
msh0: error fetching interface information: Device not found
bash-3.2# echo 1 > /sys/class/net/eth1/lbs_mesh 
bash-3.2# ifconfig msh0
msh0      Link encap:Ethernet  HWaddr 00:17:C4:05:XX:XX  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:168 (168.0 b)  TX bytes:1424 (1.3 KiB)

This not only brings down the msh0 but also sends the mesh stop command to the firmware. If this is all we need to meet this requirement, I will confirm that forwarding is indeed disabled.
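
The sysfs toggle shown above could be wrapped for use from scripts. The following is a sketch under stated assumptions, not code from the driver: the function name set_mesh and the SYSFS_ROOT override are hypothetical (the override exists only so the function can be exercised against a scratch directory), while the lbs_mesh attribute path is the one used in the transcript.

```shell
#!/bin/sh
# Sketch of a wrapper around the lbs_mesh sysfs attribute shown above.
# SYSFS_ROOT is a hypothetical override (not a kernel or driver feature)
# so the function can be tried against a scratch directory.
set_mesh() {
    iface=$1 state=$2
    attr="${SYSFS_ROOT:-/sys}/class/net/$iface/lbs_mesh"
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    if [ ! -e "$attr" ]; then
        echo "$attr not found (is the libertas driver bound to $iface?)" >&2
        return 1
    fi
    # Writing 0 removes msh0 and stops the mesh; writing 1 brings it back.
    echo "$state" > "$attr"
}

# Example (needs root on the XO): set_mesh eth1 0
```

Checking for the attribute first gives a clearer error than a failed redirect when the interface name differs between machines (the ticket mentions both eth0 and eth1).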

in reply to: ↑ 26   Changed 8 days ago by mtd

Replying to mbletsas:

I don't agree with the requirement that mesh forwarding should default to off.

Is there a way a laptop owner can do this (default to off)? I have a buggy access point and I'd rather not bug it (luckily, the AP survives but sends something dodgy causing the ADSL modem + router to reboot[1]) if I can avoid it.

Martin

1. Just so you know I'm not making this up :) - I've a NetGear and a DrayTek Vigor 2600, and this is how the Vigor (adsl) bites the dust after the NetGear (:e0:a1) spazzes out:

Jan 1 00:22:50 217.155.220.158 adsl: PoE <== Protocol:LCP(c021) EchoRep Identifier:0x17Magic Number: 0x1764 27 6b ##
Jan 1 00:23:15 217.155.220.158 adsl: WLAN_DBG - EAPoL_handler, from 0:17:f2:3f:e0:a1
Aug 29 00:19:17 cree kernel: eth0: link down.

Unfortunately a few others have had this type of problem ( http://forums.whirlpool.net.au/forum-replies-archive.cfm/902842.html ), but it only happens for me when the XO is around.

in reply to: ↑ 27   Changed 7 days ago by jcardona

  • keywords joyride-2346:+ added
  • next_action changed from communicate to qa signoff

Replying to self:

If this is all we need to meet this requirement, I will confirm that forwarding is indeed disabled.

Went ahead and confirmed that this is still working in the firmware release (5.110.22.p18) that ships with joyride-2356. After the mesh interface is turned off via sysfs, the wireless module:

  1. stops beaconing
  2. stops responding to probe requests
  3. stops responding to path requests
  4. stops forwarding traffic

In other words, the mesh is completely turned off.

  Changed 7 days ago by mbletsas

Javier's explanation and resolution are completely satisfactory.

M.
