[ovs-discuss] Network get unstable, put the whole system on halt

Salman Toor salman.toor at helsinki.fi
Mon May 20 03:00:55 PDT 2013


Hi, 

We are using Open vSwitch with the OpenStack Quantum module. The details of our system are as follows.

The OpenStack controller and compute nodes run Ubuntu 12.04.5 with kernel 3.5 and Open vSwitch version 1.4.0+build0, using GRE tunnels.

With very little activity everything works fine, but once we started to increase the load on the system, the kernel log on the controller began to grow until it filled the entire disk and halted the whole system. This happens within 2 to 3 hours.

Most of the log is filled with the following messages:


May 20 06:31:04 ukko233-cern-controller kernel: [67607.598519] Call Trace:
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598519]  <IRQ> [<ffffffff81052c9f>] warn_slowpath_common+0x7f/0xc0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598524]  [<ffffffff81052d96>] warn_slowpath_fmt+0x46/0x50
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598526]  [<ffffffff8157501b>] ? skb_release_data.part.47+0xcb/0x110
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598528]  [<ffffffff8169abd0>] skb_warn_bad_offload+0xbe/0xc9
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598531]  [<ffffffff8157f396>] skb_gso_segment+0x246/0x2c0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598536]  [<ffffffffa03dd02f>] ovs_tnl_send+0x1ef/0xc90 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598539]  [<ffffffff8169e7de>] ? _raw_spin_lock+0xe/0x20
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598541]  [<ffffffff810e0001>] ? kdb_bc+0x191/0x240
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598544]  [<ffffffff810e4fe4>] ? handle_edge_irq+0x94/0x130
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598552]  [<ffffffffa03de52e>] ovs_vport_send+0x1e/0x50 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598561]  [<ffffffffa03d5552>] do_execute_actions+0x3e2/0x790 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598570]  [<ffffffffa03d5968>] ovs_execute_actions+0x68/0x110 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598579]  [<ffffffffa03d802e>] ovs_dp_process_received_packet+0x6e/0x150 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598589]  [<ffffffffa03de4ff>] ovs_vport_receive+0x5f/0x70 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598595]  [<ffffffffa03e0e07>] patch_send+0x27/0x50 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598599]  [<ffffffffa03de52e>] ovs_vport_send+0x1e/0x50 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598603]  [<ffffffffa03d5552>] do_execute_actions+0x3e2/0x790 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598607]  [<ffffffffa03de52e>] ? ovs_vport_send+0x1e/0x50 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598610]  [<ffffffffa03d5552>] ? do_execute_actions+0x3e2/0x790 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598613]  [<ffffffffa03d5968>] ovs_execute_actions+0x68/0x110 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598617]  [<ffffffffa03d802e>] ovs_dp_process_received_packet+0x6e/0x150 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598621]  [<ffffffffa045b9b2>] ? tcp_in_window+0x342/0x5e0 [nf_conntrack]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598626]  [<ffffffffa03de4ff>] ovs_vport_receive+0x5f/0x70 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598630]  [<ffffffffa03e0143>] internal_dev_xmit+0x23/0x30 [openvswitch]
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598632]  [<ffffffff815848b6>] dev_hard_start_xmit+0x256/0x550
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598634]  [<ffffffff81584e7c>] dev_queue_xmit+0x2cc/0x470
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598637]  [<ffffffff8159f87a>] ? eth_header+0x3a/0xf0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598640]  [<ffffffff8158c832>] neigh_resolve_output+0x122/0x210
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598642]  [<ffffffff815adf85>] ? nf_hook_slow+0x75/0x150
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598644]  [<ffffffff815ba840>] ? ip_fragment+0x810/0x810
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598646]  [<ffffffff815ba9be>] ip_finish_output+0x17e/0x2d0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598648]  [<ffffffff815bb4a6>] ip_output+0x66/0xa0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598650]  [<ffffffff815b58d0>] ? inet_del_protocol+0x40/0x40
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598653]  [<ffffffff815b7689>] ip_forward_finish+0x69/0x80
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598655]  [<ffffffff815b7931>] ip_forward+0x291/0x3e0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598657]  [<ffffffff815b59dd>] ip_rcv_finish+0x10d/0x370
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598660]  [<ffffffff815b6291>] ip_rcv+0x201/0x300
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598662]  [<ffffffff81582a13>] ? netif_receive_skb+0x23/0x90
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598664]  [<ffffffff81582576>] __netif_receive_skb+0x4c6/0x540
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598666]  [<ffffffff815835c1>] process_backlog+0xb1/0x190
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598668]  [<ffffffff815832f4>] net_rx_action+0x134/0x240
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598671]  [<ffffffff8105ba88>] __do_softirq+0xa8/0x210
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598679]  [<ffffffff8169e7de>] ? _raw_spin_lock+0xe/0x20
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598686]  [<ffffffff816a841c>] call_softirq+0x1c/0x30
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598694]  [<ffffffff81016245>] do_softirq+0x65/0xa0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598702]  [<ffffffff8105be6e>] irq_exit+0x8e/0xb0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598710]  [<ffffffff816a8c73>] do_IRQ+0x63/0xe0
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598715]  [<ffffffff8169ec6a>] common_interrupt+0x6a/0x6a
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598716]  <EOI>
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598717] ---[ end trace 4ed1c8725cfe8f94 ]---
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598733] ------------[ cut here ]------------
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598736] WARNING: at /build/buildd/linux-lts-quantal-3.5.0/net/core/dev.c:1904 skb_warn_bad_offload+0xbe/0xc9()
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598737] Hardware name: PowerEdge M610
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598738] : caps=(0x00000000400158e9, 0x0000000000000000) len=2856 data_len=1402 gso_size=1402 gso_type=1 ip_summed=1
May 20 06:31:04 ukko233-cern-controller kernel: [67607.598739] Modules linked in: 8021q garp xt_conntrack ipt_REDIRECT ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE xt_state ipt_REJECT xt_CHECKSUM bridge stp llc xt_tcpudp iptable_filter iptable_mangle iptable_nat nf_nat vesafb nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables openvswitch(O) iscsi_trgt(O) nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_iser ext2 rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi gpio_ich coretemp kvm_intel kvm dcdbas microcode wmi acpi_power_meter lpc_ich joydev ioatdma dca i7core_edac edac_core mac_hid lp parport hid_generic usbhid hid usb_storage uas mptsas mptscsih mptbase scsi_transport_sas bnx2x libcrc32c mdio bnx2
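
To quantify how much of the log these warnings account for, a small script along the following lines could extract the numeric fields from the offload line and count the warnings. This is only a sketch: the helper names `parse_offload_line` and `count_warnings` and the regular expression are illustrative assumptions on our side, not part of OVS or any kernel tool.

```python
import re

# Matches the numeric fields on the parameter line printed by
# skb_warn_bad_offload, e.g. "len=2856 data_len=1402 gso_size=1402 ...".
# The leading \b keeps "len" from matching inside "data_len".
FIELD_RE = re.compile(r"\b(len|data_len|gso_size|gso_type|ip_summed)=(\d+)")


def parse_offload_line(line):
    """Return the numeric fields of one offload line as a dict (hypothetical helper)."""
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


def count_warnings(lines):
    """Count WARNING lines that mention skb_warn_bad_offload (hypothetical helper)."""
    return sum(
        1 for line in lines
        if "WARNING" in line and "skb_warn_bad_offload" in line
    )


# Two lines copied from the log above, used as sample input.
sample_lines = [
    "May 20 06:31:04 ukko233-cern-controller kernel: [67607.598736] WARNING: at "
    "/build/buildd/linux-lts-quantal-3.5.0/net/core/dev.c:1904 "
    "skb_warn_bad_offload+0xbe/0xc9()",
    "May 20 06:31:04 ukko233-cern-controller kernel: [67607.598738] : "
    "caps=(0x00000000400158e9, 0x0000000000000000) len=2856 data_len=1402 "
    "gso_size=1402 gso_type=1 ip_summed=1",
]

fields = parse_offload_line(sample_lines[1])
warnings = count_warnings(sample_lines)
print(fields, warnings)
```

In practice the lines would be read from the kernel log file instead of the two sample lines used here.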

Since the log is mostly filled with OVS messages, I thought it would be better to ask on the OVS list.

Has anybody else experienced this? Are we using a version of OVS that is incompatible with this kernel?

Any hint that could help us fix this problem would be appreciated.

Regards.
Salman. 



