Branch: refs/heads/fixes
Home: https://github.com/kronosnet/kronosnet
Commit: 06e628fb797e1ef592dcb882ffbd524318ea8f81
https://github.com/kronosnet/kronosnet/commit/06e628fb797e1ef592dcb882ffbd52...
Author: Fabio M. Di Nitto fdinitto@redhat.com
Date: 2018-01-08 (Mon, 08 Jan 2018)

Changed paths:
M libknet/threads_pmtud.c

Log Message:
-----------
[PMTUd] drop (now) unnecessary and dangerous usleep
Prior to all threads being able to notify PMTUd of EMSGSIZE errors, we had this random usleep in there to allow time to collect data. It worked at the time, but it's a bad idea.
On super large clusters (>66 nodes) with 4 links on each node, when applying heavy load (cpghum on all nodes at once), the average latency between nodes can increase so much that the PMTUd thread usleep could literally block corosync for seconds at a time.
Drop the usleep and live happily ever after.
Signed-off-by: Fabio M. Di Nitto fdinitto@redhat.com