[ovs-dev] [optimize 11/26] netlink: Postpone choosing sequence numbers until send time.
blp at nicira.com
Wed Apr 18 09:36:40 PDT 2012
On Tue, Apr 17, 2012 at 05:21:03PM -0700, Ethan Jackson wrote:
> Does nl_sock_allocate_seq() need to take 'n' as a parameter? It's
> only caller passes in 1 currently. Perhaps a future patch will need
> It's probably worth adding a comment to nl_sock_allocate_seq()
> explaining why it wraps around at (UINT32_MAX / 2) without a deep
> understanding of netlink, it's not obvious to me why it's necessary.
> Does the theoretical race condition described in the old
> get_nlmsg_seq() function still exist? Just curious, may not be a big
> deal. If it does, maybe it makes sense to use a random initial
> sequence number instead of 1? Wouldn't solve the problem, but might
> reduce the probability of a collision.
I added a comment to handle the nl_sock_allocate_seq() for your second
/* Make it impossible for the next request for sequence numbers to wrap
* around to 0. Start over with 1 to avoid ever using a sequence number of
* 0, because the kernel uses sequence number 0 for notifications. */
I added the following to the commit message to address your other points:
Choosing sequence numbers at time of creating a packet means that
nl_sock_transact_multiple() has to search for the sequence number
of a reply, because the sequence numbers of the requests aren't
necessarily sequential. This commit makes it possible to avoid
the search, by deferring choice of sequence numbers until the
time that we send the packets. It doesn't actually modify
nl_sock_transact_multiple(), which will happen in a later commit.
Previously, I was concerned about a theoretical race condition
described in a comment in the old versino of this code:
This implementation uses sequence numbers that are unique
process-wide, to avoid a hypothetical race: send request, close
socket, open new socket that reuses the old socket's PID value,
send request on new socket, receive reply from kernel to old
socket but with same PID and sequence number. (This race could be
avoided other ways, e.g. by preventing PIDs from being quickly
However, I no longer believe that this can be a real problem,
because Netlink operates synchronously. The reply to a request
will always arrive before the socket can be closed and a new
socket opened with the old socket's PID.
More information about the dev