Branch: refs/heads/mtu-lock
Home: https://github.com/kronosnet/kronosnet
Commit: 718d1e80b56677a1cffe28be2deb25f994062cbe
https://github.com/kronosnet/kronosnet/commit/718d1e80b56677a1cffe28be2deb2…
Author: Fabio M. Di Nitto <fdinitto(a)redhat.com>
Date: 2017-12-17 (Sun, 17 Dec 2017)
Changed paths:
M libknet/handle.c
M libknet/host.c
M libknet/internals.h
M libknet/links.c
M libknet/logging.c
M libknet/tests/Makefile.am
M libknet/threads_common.c
M libknet/threads_common.h
M libknet/threads_dsthandler.c
M libknet/threads_pmtud.c
M libknet/transport_sctp.c
M libknet/transports.c
Log Message:
-----------
[PMTUd] fix external API and PMTUd interaction
The problem:
PMTUd can take a long time to release the global read lock, mostly because
of the pthread_cond_timedwait it needs to ack/nack packets from the other
hosts. This delay can block any wrlock operation for several seconds, if
not more.
The solution:
Each call to the global pthread_rwlock_wrlock has been replaced with a wrapper
that first notifies PMTUd to interrupt its operations (and restart later),
then queues for the global write lock, which it obtains as soon as PMTUd
backs out.
This also makes shutdown much faster.
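For illustration only, the idea can be sketched roughly as below. This is not
the code from the patch: struct demo_handle, pmtud_run,
wrlock_interrupting_pmtud and run_demo are hypothetical names, and the 5 second
timeout is an assumed stand-in for the ack/nack wait. The sketch only shows the
ordering: set an abort flag and signal the condition, then call
pthread_rwlock_wrlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical state: names are illustrative, not libknet's actual fields. */
struct demo_handle {
	pthread_rwlock_t global_rwlock; /* global lock shared by API and threads */
	pthread_mutex_t pmtud_mutex;    /* protects the three flags below */
	pthread_cond_t pmtud_cond;      /* PMTUd sleeps here waiting for ack/nack */
	int pmtud_running;              /* PMTUd currently holds the read lock */
	int pmtud_abort;                /* a writer asked PMTUd to stop */
	int pmtud_interrupted;          /* last run ended because of an abort */
};

/* Stand-in for one PMTUd run: hold the read lock while waiting up to 5s. */
static void *pmtud_run(void *arg)
{
	struct demo_handle *h = arg;
	struct timespec deadline;

	pthread_rwlock_rdlock(&h->global_rwlock);
	pthread_mutex_lock(&h->pmtud_mutex);
	h->pmtud_running = 1;
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 5;
	while (!h->pmtud_abort) {
		if (pthread_cond_timedwait(&h->pmtud_cond, &h->pmtud_mutex,
					   &deadline) == ETIMEDOUT)
			break;
	}
	h->pmtud_interrupted = h->pmtud_abort;
	h->pmtud_abort = 0; /* a real implementation would reschedule the run */
	h->pmtud_running = 0;
	pthread_mutex_unlock(&h->pmtud_mutex);
	pthread_rwlock_unlock(&h->global_rwlock);
	return NULL;
}

/* Wrapper: interrupt PMTUd first, then queue for the write lock. */
static int wrlock_interrupting_pmtud(struct demo_handle *h)
{
	pthread_mutex_lock(&h->pmtud_mutex);
	if (h->pmtud_running) {
		h->pmtud_abort = 1;
		pthread_cond_signal(&h->pmtud_cond);
	}
	pthread_mutex_unlock(&h->pmtud_mutex);
	return pthread_rwlock_wrlock(&h->global_rwlock);
}

/* Returns 1 if the write lock was obtained well before the 5s timeout. */
static int run_demo(void)
{
	struct demo_handle h = { .pmtud_running = 0 };
	struct timespec t0, t1, tick = { 0, 1000000 }; /* 1 ms */
	pthread_t tid;
	int running = 0;

	pthread_rwlock_init(&h.global_rwlock, NULL);
	pthread_mutex_init(&h.pmtud_mutex, NULL);
	pthread_cond_init(&h.pmtud_cond, NULL);
	if (pthread_create(&tid, NULL, pmtud_run, &h) != 0)
		return 0;
	while (!running) { /* wait until PMTUd holds the read lock */
		pthread_mutex_lock(&h.pmtud_mutex);
		running = h.pmtud_running;
		pthread_mutex_unlock(&h.pmtud_mutex);
		nanosleep(&tick, NULL);
	}
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (wrlock_interrupting_pmtud(&h) != 0)
		return 0;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	pthread_rwlock_unlock(&h.global_rwlock);
	pthread_join(tid, NULL);
	assert(h.pmtud_running == 0);
	return h.pmtud_interrupted && (t1.tv_sec - t0.tv_sec) < 2;
}
```

Calling run_demo() from a main() and building with -pthread should show the
writer getting the lock almost immediately. Calling pthread_rwlock_wrlock
directly instead would leave the writer blocked until the timed wait expires.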
How to test:
This is not simple to test and verify. I used 2 VMs with a known MTU of
1500. Start knet_bench on both (normal ping_data -C is more than enough).
Once they have established data exchange, change the MTU on one of the nodes
to 1600 (or higher). This should guarantee that the PMTUd process will take
a very long time to complete.
First verify that the PMTUd process takes several seconds.
Once the next PMTUd run starts, hit Ctrl+C on the node that is executing
PMTUd. The process should exit much faster than before this patch.
Signed-off-by: Fabio M. Di Nitto <fdinitto(a)redhat.com>