[ovs-discuss] xenServer and openVswitch 1.0.99

Juan Tellez juan at embrane.com
Thu Feb 2 22:06:08 PST 2012


Jesse,

>>> What's the traffic mixture like when you have this problem with vlans (i.e. single flow vs. many connections)?  If you run a single stream, what is the ratio of hits to misses on the relevant datapath?

Our traffic is varied: some flows are very short, others are long-lasting TCP connections.  We are mostly concerned about the long flows dropping lots of packets.  When we see messages like the ones above, can we expect that the vswitch has dropped packets?

I think the relevant traffic is on the vif*.0 ports below, where the miss rate is about 1.1%.  Can you explain the hit/missed/lost statistics below?

system at xenbr2:
	lookups: frags:0, hit:110495900, missed:1284600, lost:34
	port 0: xenbr2 (internal)
	port 1: eth2
	port 2: xapi21 (internal)
	port 3: vif455.1
	port 4: vif462.0
	port 5: vif456.0
	port 6: vif461.0
	port 7: vif457.0
	port 8: vif458.0
	port 9: vif465.0
	port 10: vif459.0
	port 11: vif467.0
	port 12: vif460.0
	port 13: vif463.0
	port 14: vif464.0
	port 16: vif466.0
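
For reference, here is a minimal Python sketch of how a figure like the 1.1% can be derived from the lookups line. It assumes the ratio is taken as missed / (hit + missed); the function names are made up for illustration and are not part of any OVS tool.

```python
import re


def parse_lookups(line):
    """Extract the integer counters from an 'ovs-dpctl show' lookups line.

    Hypothetical helper: it only picks up 'name:number' pairs, so the
    leading 'lookups:' label (followed by a space) is skipped.
    """
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


def miss_ratio(counters):
    """Fraction of datapath lookups that missed (assumed definition)."""
    total = counters["hit"] + counters["missed"]
    return counters["missed"] / total if total else 0.0


if __name__ == "__main__":
    sample = "\tlookups: frags:0, hit:110495900, missed:1284600, lost:34"
    counters = parse_lookups(sample)
    print(counters)
    print("miss ratio: {:.2%}".format(miss_ratio(counters)))
```

Running the same calculation on the output of a single-stream test would give the hit-to-miss ratio asked about above.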

We are attempting to push the host by sending lots of small flows, to see if we can reproduce the problem on our system more easily.

>>> Is there anything interesting in the ovs-vswitchd log?

No.  The logs are empty for the times we are interested in.

The kern.log traces are interesting, though.  They do seem to correlate with some of the failures we see:

/var/log/kern.log.9.gz:Oct  2 20:55:58 localhost kernel: vif122.0: draining TX queue
/var/log/kern.log.9.gz:Oct  2 20:56:00 localhost kernel: vif117.0: draining TX queue
/var/log/kern.log.9.gz:Oct  2 20:56:00 localhost kernel: vif121.0: draining TX queue
/var/log/kern.log.9.gz:Oct  2 20:56:02 localhost kernel: vif112.0: draining TX queue
/var/log/kern.log.9.gz:Oct  2 20:56:05 localhost kernel: vif113.0: draining TX queue

Does draining occur at a regular interval?
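
One way to check this against the full logs is to pull out the "draining TX queue" lines and look at the gaps between them. The following is a rough Python sketch under a few assumptions: the lines look like the grep output above, the year is supplied by hand because syslog timestamps do not carry one (the default below is only a placeholder), and the helper names are invented for illustration.

```python
import re
from datetime import datetime

# Matches e.g. "Oct  2 20:55:58 localhost kernel: vif122.0: draining TX queue",
# with or without a leading "file:" prefix from grep.
DRAIN_RE = re.compile(
    r"(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(\S+):\s+draining TX queue"
)


def drain_events(lines, year=2011):
    """Return sorted (timestamp, interface) pairs for each drain message.

    The year is an assumption passed in by the caller.
    """
    events = []
    for line in lines:
        match = DRAIN_RE.search(line)
        if match:
            stamp = " ".join(match.group(1).split())  # collapse padded day
            when = datetime.strptime("{} {}".format(year, stamp), "%Y %b %d %H:%M:%S")
            events.append((when, match.group(2)))
    return sorted(events)


def gaps_seconds(events):
    """Seconds between consecutive drain messages."""
    times = [when for when, _ in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    sample = [
        "/var/log/kern.log.9.gz:Oct  2 20:55:58 localhost kernel: vif122.0: draining TX queue",
        "/var/log/kern.log.9.gz:Oct  2 20:56:00 localhost kernel: vif117.0: draining TX queue",
        "/var/log/kern.log.9.gz:Oct  2 20:56:00 localhost kernel: vif121.0: draining TX queue",
        "/var/log/kern.log.9.gz:Oct  2 20:56:02 localhost kernel: vif112.0: draining TX queue",
        "/var/log/kern.log.9.gz:Oct  2 20:56:05 localhost kernel: vif113.0: draining TX queue",
    ]
    print(gaps_seconds(drain_events(sample)))
```

Grouping the events by interface before computing the gaps would show whether any single vif drains periodically.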

Thanks,

Juan Tellez



-----Original Message-----
From: Jesse Gross [mailto:jesse at nicira.com] 
Sent: Thursday, February 02, 2012 6:36 PM
To: Juan Tellez
Cc: discuss at openvswitch.org; Vijay Chander
Subject: Re: [ovs-discuss] xenServer and openVswitch 1.0.99

On Wed, Feb 1, 2012 at 6:07 PM, Juan Tellez <juan at embrane.com> wrote:
> Jesse,
>
> Dmesg hasn't changed for a while .. and sadly it is not time-stamped.  Below is the tail:
>
> ..
> device vif467.1 entered promiscuous mode
> device tap467.0 entered promiscuous mode
> device tap467.1 entered promiscuous mode
> /local/domain/465/device/vif/0: Connected
> /local/domain/465/device/vif/1: Connected
> /local/domain/466/device/vif/0: Connected
> /local/domain/466/device/vif/1: Connected
> /local/domain/467/device/vif/0: Connected
> /local/domain/467/device/vif/1: Connected
> vif458.2: draining TX queue
> vif456.2: draining TX queue
> vif457.2: draining TX queue
> vif459.2: draining TX queue
>
>> What are the outputs of dmesg and ovs-dpctl show?

What's the traffic mixture like when you have this problem with vlans
(i.e. single flow vs. many connections)?  If you run a single stream,
what is the ratio of hits to misses on the relevant datapath?

Is there anything interesting in the ovs-vswitchd log?

