Branch: refs/heads/mtu-lock
Home: https://github.com/kronosnet/kronosnet
Commit: 718d1e80b56677a1cffe28be2deb25f994062cbe
https://github.com/kronosnet/kronosnet/commit/718d1e80b56677a1cffe28be2deb25...
Author: Fabio M. Di Nitto fdinitto@redhat.com
Date: 2017-12-17 (Sun, 17 Dec 2017)
Changed paths:
M libknet/handle.c
M libknet/host.c
M libknet/internals.h
M libknet/links.c
M libknet/logging.c
M libknet/tests/Makefile.am
M libknet/threads_common.c
M libknet/threads_common.h
M libknet/threads_dsthandler.c
M libknet/threads_pmtud.c
M libknet/transport_sctp.c
M libknet/transports.c
Log Message:
-----------
[PMTUd] fix external API and PMTUd interaction
The problem:
PMTUd can take a long time to release the global read lock, mostly because of the pthread_cond_timedwait it needs to ack/nack packets from the other hosts. This delay could block any wrlock operation for several seconds, if not more.
The solution:
Each call to the global pthread_rwlock_wrlock has been replaced with a wrapper that first notifies PMTUd to interrupt its operations (and restart), then requests the global write lock, which is queued and granted as soon as PMTUd backs out.
This solution also greatly improves shutdown speed.
How to test:
This is not simple to test and verify. I used 2 VMs with a known MTU of 1500. Start knet_bench on both (normal ping_data -C is more than enough). Once they have established data exchange, change the MTU on one of the nodes to 1600 (or higher). This should guarantee that the PMTUd process takes a very long time to complete. First verify that the PMTUd process takes several seconds. Once the next PMTUd run starts, press Ctrl+C on the node that is running PMTUd; the process should exit much faster than it did before this patch.
Signed-off-by: Fabio M. Di Nitto fdinitto@redhat.com