[ovs-dev] proposed flow key compatibility rules
jesse at nicira.com
Fri Nov 4 16:47:36 PDT 2011
On Fri, Nov 4, 2011 at 12:22 PM, Ben Pfaff <blp at nicira.com> wrote:
> I'm also thinking about changing the flow key format by dropping the
> ordering restrictions. There's no real benefit to them unless
> anything is actually sensitive to ordering (e.g. we allow duplicate
> attributes, which my proposal below would avoid). I've already
> implemented the userspace half of this as part of another change that
> I'm working on.
In general, I agree that dropping the ordering requirements is a good
thing, since they don't seem very Netlink-ish. Of course, the most
important part is that doing so doesn't introduce any ambiguity.
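As a rough illustration of why forbidding duplicates removes the need for ordering, here is a minimal Python sketch. It models a flow key as a list of (name, value) pairs; the function name and attribute names are made up for illustration and are not the actual datapath interface.

```python
def parse_flow_key(attrs):
    """Collect (name, value) pairs into a dict, accepting any order.

    A duplicate attribute is rejected: it is the one case where the
    order of attributes could otherwise change the meaning of the key.
    """
    key = {}
    for name, value in attrs:
        if name in key:
            raise ValueError("duplicate flow key attribute: %s" % name)
        key[name] = value
    return key


# The same attributes in two different orders give the same key.
a = parse_flow_key([("in_port", 1), ("eth_type", 0x0800), ("ip_proto", 6)])
b = parse_flow_key([("ip_proto", 6), ("in_port", 1), ("eth_type", 0x0800)])
```

With duplicates ruled out, the parsed result depends only on the set of attributes present, so no ordering rule is needed to keep it unambiguous.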
Somewhat related to this, I think that we probably also want to loosen
the requirements for the flow metadata that is given to the kernel.
For example, if the kernel doesn't understand tun_id but userspace
passes it then currently we will reject the flow, even though it's not
important. Since all of the metadata is optional, I think it's
probably OK to be more flexible here.
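A minimal sketch of the looser behavior, in Python for brevity. The attribute sets and the function name are assumptions made for illustration only; the real kernel code is C and is organized differently.

```python
# Hypothetical attribute names, for illustration only.
KERNEL_KNOWN = {"in_port", "eth_type", "ipv4", "tcp"}   # what this kernel parses
OPTIONAL_METADATA = {"in_port", "tun_id"}               # metadata, all optional


def accept_flow_key(attrs):
    """Split a flow key into attributes the kernel uses and metadata it skips.

    Unknown metadata is skipped instead of causing the flow to be
    rejected. Any other unknown attribute is still an error.
    """
    accepted, skipped = {}, []
    for name, value in attrs.items():
        if name in KERNEL_KNOWN:
            accepted[name] = value
        elif name in OPTIONAL_METADATA:
            skipped.append(name)
        else:
            raise ValueError("unsupported flow key attribute: %s" % name)
    return accepted, skipped


accepted, skipped = accept_flow_key(
    {"tun_id": 5, "in_port": 1, "eth_type": 0x0800})
```

Here the hypothetical kernel does not know tun_id, so it lands in skipped and the rest of the flow is still accepted.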
> Naively, to add VLAN support, it makes sense to add a new "vlan" flow
> key attribute to contain the VLAN tag, then continue to decode the
> encapsulated headers beyond the VLAN tag using the existing field
> definitions. With this change, a TCP packet in VLAN 10 would have a
> flow key much like this:
> But this change would negatively affect a userspace application that
> has not been updated to understand the new "vlan" flow key attribute.
> The application could, following the flow compatibility rules above,
> ignore the "vlan" attribute that it does not understand and therefore
> assume that the flow contained IP packets. This is a bad assumption
> (the flow only contains IP packets if one parses and skips over the
> 802.1Q header) and it could cause the application's behavior to change
> across kernel versions even though it follows the compatibility rules.
I wonder if this actually matters in practice. If the application
always does its own flow extraction then it would find the vlan (or
see EtherType 0x8100) itself, regardless of what the kernel tells it.
If the kernel did not understand vlans and the application does, then
it would recognize that the kernel flow does not contain a vlan header
and therefore cannot process the flow. Conversely, if the application
does not understand vlan and the kernel does, it would never mistake
it for an IP packet. I think the actual values only matter if you are
trying to directly use the kernel flow key, which we're not talking
On the other hand, I think it's clearly better to have the meaning of
values stay constant; I just wonder whether that is always practical.
> The solution is to use a set of nested attributes. This is, for
> example, why 802.1Q support uses nested attributes. A TCP packet in
> VLAN 10 is actually expressed as:
> Notice how the encapsulated "eth_type", "ip", and "tcp" flow key
> attributes are nested inside the "vlan" attribute. Thus, an
> application that does not understand the "vlan" key will not see
> either of those attributes and therefore will not misinterpret them.
> (Also, the outer eth_type is still 0x8100, not changed to 0x0800.)
How are the vid and pcp members represented? Everything else in that
list is its own attribute. vid and pcp are part of the vlan attribute
but that has now been converted into a nested container.
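The nesting described in the quoted text can be sketched with the generic Netlink attribute layout (a 16-bit length and 16-bit type in host byte order, with the payload padded to 4 bytes). The attribute type numbers below are made up for illustration and are not the real flow key values, and the sketch deliberately leaves out vid and pcp, since how they are carried is the open question here.

```python
import struct

NLA_HDRLEN = 4
# Hypothetical attribute type numbers, for illustration only.
ETH_TYPE, VLAN, IP, TCP = 1, 2, 3, 4


def nla_align(n):
    """Round n up to the 4-byte Netlink attribute alignment."""
    return (n + 3) & ~3


def put_attr(attr_type, payload):
    """Encode one attribute: length and type header, payload, padding."""
    length = NLA_HDRLEN + len(payload)
    pad = nla_align(length) - length
    return struct.pack("=HH", length, attr_type) + payload + b"\0" * pad


def parse_attrs(buf):
    """Return the (type, payload) pairs at one nesting level of buf."""
    attrs, offset = [], 0
    while offset + NLA_HDRLEN <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < NLA_HDRLEN or offset + length > len(buf):
            raise ValueError("malformed attribute")
        attrs.append((attr_type, buf[offset + NLA_HDRLEN:offset + length]))
        offset += nla_align(length)
    return attrs


# Encapsulated headers go inside the vlan container (empty payloads
# stand in for the real ip and tcp field structures).
inner = (put_attr(ETH_TYPE, struct.pack("!H", 0x0800))
         + put_attr(IP, b"") + put_attr(TCP, b""))
key = put_attr(ETH_TYPE, struct.pack("!H", 0x8100)) + put_attr(VLAN, inner)

top_level = [t for t, _ in parse_attrs(key)]
nested = [t for t, _ in parse_attrs(dict(parse_attrs(key))[VLAN])]
```

An application that only walks the top level and skips the vlan type it does not recognize sees just the outer eth_type of 0x8100; the ip and tcp attributes are reachable only by parsing the container's payload.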
In a somewhat related vein, how do we deal with actions now that they
are tied to the match structure? This is one area where I thought
that the previous model of associating the TPID with the vlan tag
worked particularly well because you want to insert a tag and keep the
If there is an invalid vlan tag, would you pass up a zeroed out
attribute similar to how we do for other protocols?
For multiple levels of tagging, I'm guessing that it would look
something like this:
What about things like MPLS? Would we use nested attributes for that
as well?
The final area that I was thinking about was IPv6 extension headers.
I think there's a good chance that at some point we'll want to be able
to parse some of the extension headers that we currently skip over.
We can insert these as additional attributes in between the IPv6
header and the L4 header but there are two things that I wonder about:
* Is it important for the next header field in the IPv6 header to point
correctly to the extension header? In the IPv6 header itself, the
answer is yes. Currently in our fields, it points to the L4 header.
It's definitely important in the flow matching structures where
fields are aliased on top of each other but is perhaps somewhat
redundant in Netlink where the next field is implied by the following
* It's possible to have multiple extension headers of the same type
(I guess it depends on how finely you break it down but certainly you
can have multiple hop-by-hop headers). I'm not sure how important
ordering is in practice for the headers that can be repeated but in
theory you're supposed to process them in order.
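To make the two points above concrete, here is a small Python sketch that walks an IPv6 extension header chain in wire order and keeps repeated headers as separate entries. The function name is an assumption, and only the hop-by-hop, routing, fragment, and destination options headers are handled. The example uses two destination options headers, a repetition that RFC 2460 explicitly allows.

```python
HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS, TCP = 0, 43, 44, 60, 6
EXT_HEADERS = {HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS}


def walk_ext_headers(next_header, payload):
    """Return ([(type, raw_bytes), ...], l4_proto), preserving wire order.

    next_header is the value from the fixed IPv6 header; payload is the
    data that follows the fixed header.
    """
    found, offset = [], 0
    while next_header in EXT_HEADERS:
        if offset + 8 > len(payload):
            raise ValueError("truncated extension header")
        if next_header == FRAGMENT:
            length = 8                      # fragment header is fixed size
        else:
            # Hdr Ext Len counts 8-octet units beyond the first 8 octets.
            length = (payload[offset + 1] + 1) * 8
        if offset + length > len(payload):
            raise ValueError("truncated extension header")
        found.append((next_header, payload[offset:offset + length]))
        next_header = payload[offset]       # each header names its successor
        offset += length
    return found, next_header


# dest opts -> routing -> dest opts -> TCP, each header 8 bytes long.
payload = (bytes([ROUTING, 0] + [0] * 6)
           + bytes([DEST_OPTS, 0] + [0] * 6)
           + bytes([TCP, 0] + [0] * 6))
headers, l4_proto = walk_ext_headers(DEST_OPTS, payload)
types_in_order = [t for t, _ in headers]
```

Because the result is an ordered list, the next header value of each entry is implied by whatever follows it, which is the same redundancy noted for Netlink above.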