[ovs-discuss] xenServer and openVswitch 1.0.99
jesse at nicira.com
Fri Feb 3 09:19:34 PST 2012
On Thu, Feb 2, 2012 at 10:06 PM, Juan Tellez <juan at embrane.com> wrote:
>>>> What's the traffic mixture like when you have this problem with vlans (i.e. single flow vs. many connections)? If you run a single stream, what is the ratio of hits to misses on the relevant datapath?
> Our traffic is varied: some flows are very short, others are long-lasting TCP connections.
If you have many short flows, it's possible that the CPU load you see
is simply the result of normal processing.
> We are mostly concerned about the long flows dropping lots of packets. When we see messages like the ones above, can we expect that the vswitch has dropped packets?
I don't really see anything in the information that you've given that
indicates OVS is the one dropping packets.
> I think the relevant traffic is in the vif*.0 below, which is 1.1%. Can you explain the hit/miss/lost statistics below?
Hits are packets processed entirely in the kernel. Misses are packets
sent to userspace for flow setup. Lost packets are ones that should
have been queued to userspace but were dropped because the queue was
already full.
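
To put numbers on the hit/miss question, a small script along these
lines may help. It is only a sketch: it assumes a "lookups:" line with
hit:, missed: and lost: counters (the kind of line ovs-dpctl show
prints), it assumes lost packets are also counted as misses, and the
sample counters are invented for illustration.

```python
import re

# Invented sample line; the exact layout can differ between OVS versions.
SAMPLE = "lookups: hit:9000 missed:1000 lost:10"


def lookup_stats(line):
    """Summarize the hit/missed/lost counters found in one lookups line."""
    counters = {name: int(value)
                for name, value in re.findall(r"\b(hit|missed|lost):(\d+)", line)}
    hit = counters.get("hit", 0)
    missed = counters.get("missed", 0)
    lost = counters.get("lost", 0)
    # Assumption: every lookup is either a hit or a miss, and lost
    # packets are a subset of the misses.
    total = hit + missed
    return {
        "hit": hit,
        "missed": missed,
        "lost": lost,
        "miss_fraction": missed / total if total else 0.0,
        "lost_fraction_of_misses": lost / missed if missed else 0.0,
    }


if __name__ == "__main__":
    print(lookup_stats(SAMPLE))
```

A single long-lived stream should show a miss fraction close to zero,
while many short flows push it up. A lost counter that keeps growing
is the case where OVS itself is discarding packets.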
> Kern.log traces are interesting .. they do seem to correlate to some of the failures we see:
> /var/log/kern.log.9.gz:Oct 2 20:55:58 localhost kernel: vif122.0: draining TX queue
> /var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif117.0: draining TX queue
> /var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif121.0: draining TX queue
> /var/log/kern.log.9.gz:Oct 2 20:56:02 localhost kernel: vif112.0: draining TX queue
> /var/log/kern.log.9.gz:Oct 2 20:56:05 localhost kernel: vif113.0: draining TX queue
> Is draining occurring on a regular interval?
Those messages are coming from netback, not OVS. Combined with the
fact that you see dropped counts going up on the interface itself,
netback seems the likely place where packets are being dropped.
Something on the guest side is probably not keeping up, but you
should talk to the Xen guys.
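
On whether the draining happens at a regular interval: one way to
check is to pull the timestamps out of the log and look at the gaps.
The sketch below does that for the lines quoted above. The regular
expression and the function names are my own, the year is a
placeholder because syslog lines do not carry one, and month parsing
assumes an English locale.

```python
import re
from collections import Counter
from datetime import datetime

# The kern.log lines quoted in this thread (zgrep-style, with file prefix).
SAMPLE = """\
/var/log/kern.log.9.gz:Oct 2 20:55:58 localhost kernel: vif122.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif117.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif121.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:02 localhost kernel: vif112.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:05 localhost kernel: vif113.0: draining TX queue
"""

PATTERN = re.compile(
    r"(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(vif\d+\.\d+): draining TX queue"
)


def drain_events(text):
    """Return sorted (timestamp, interface) pairs for each drain message."""
    events = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue
        month, day, clock, vif = match.groups()
        # Placeholder year 2000: only differences between timestamps matter.
        when = datetime.strptime(f"2000 {month} {day} {clock}",
                                 "%Y %b %d %H:%M:%S")
        events.append((when, vif))
    return sorted(events)


def summarize(text):
    """Count drain messages per interface and list the gaps in seconds."""
    events = drain_events(text)
    per_vif = Counter(vif for _, vif in events)
    times = [when for when, _ in events]
    gaps = [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]
    return per_vif, gaps


if __name__ == "__main__":
    counts, gaps = summarize(SAMPLE)
    print(counts)
    print(gaps)
```

Five lines are too few to say anything about periodicity, so this is
only useful when run over the full log. Evenly spaced gaps would point
to a timer, while bursts that line up with the failures would point to
the guests falling behind.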