Branch: refs/heads/coverity_scan
Home: https://github.com/kronosnet/kronosnet
Commit: 434299300a2f23acd96f2f287939549ab2944411
https://github.com/kronosnet/kronosnet/commit/434299300a2f23acd96f2f28793954...
Author: Fabio M. Di Nitto fdinitto@redhat.com
Date: 2019-08-13 (Tue, 13 Aug 2019)
Changed paths:
M libknet/internals.h
M libknet/links.c
M libknet/links.h
M libknet/threads_pmtud.c
Log Message:
-----------
[PMTUd] add dynamic pong timeout when using crypto
Problem originally reported by the Proxmox community: users observed that, under pressure, the MTU would flap back and forth between two values because PMTUd responses from the other node timed out.
Implement a dynamic timeout multiplier when using crypto, which should solve the problem in a more flexible fashion.
When a timeout hits, the new log messages look like this:
[knet]: [info] host: host: 1 (passive) best link: 0 (pri: 0)
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (4) for host 1 link: 0
[knet]: [info] pmtud: PMTUD link change for host: 1 link: 0 from 469 to 65429
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [info] pmtud: Global data MTU changed to: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (8) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (16) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (32) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (128) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
And when latency drops and it is safe to be more responsive again:
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Decreasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
....
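The behaviour visible in the logs above (the multiplier doubling after a timed-out probe and halving once latency improves) can be sketched roughly as follows. This is only an illustration, not the code from libknet/threads_pmtud.c: the struct, the function names and the SKETCH_* bounds are hypothetical, the lower bound of 2 is assumed from the fixed "* 2" factor the patch replaces, and the upper bound of 128 is simply the largest value seen in the logs. Only pong_timeout_adj and pmtud_crypto_timeout_multiplier are field names taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds: assumptions for this sketch, not libknet constants. */
#define SKETCH_MULTIPLIER_MIN 2U
#define SKETCH_MULTIPLIER_MAX 128U

/* Hypothetical stand-in for the per-link state touched by the patch. */
struct sketch_link {
	uint64_t pong_timeout_adj;                /* base pong timeout */
	uint32_t pmtud_crypto_timeout_multiplier; /* current dynamic multiplier */
};

/* Effective PMTUd response timeout: scaled only when crypto is in use. */
static uint64_t sketch_pmtud_timeout(const struct sketch_link *link, bool crypto_in_use)
{
	assert(link != NULL);
	if (crypto_in_use) {
		return link->pong_timeout_adj * link->pmtud_crypto_timeout_multiplier;
	}
	return link->pong_timeout_adj;
}

/*
 * Double the multiplier after a timed-out probe, halve it after a probe
 * that answered in time, and keep it within [MIN, MAX].
 */
static void sketch_adjust_multiplier(struct sketch_link *link, bool timed_out)
{
	assert(link != NULL);
	if (timed_out) {
		if (link->pmtud_crypto_timeout_multiplier < SKETCH_MULTIPLIER_MAX) {
			link->pmtud_crypto_timeout_multiplier *= 2;
		}
	} else if (link->pmtud_crypto_timeout_multiplier > SKETCH_MULTIPLIER_MIN) {
		link->pmtud_crypto_timeout_multiplier /= 2;
	}
}
```

Under these assumptions, a link starting at multiplier 2 that times out six times in a row ends up at 128, and one good probe afterwards brings it back to 64, matching the sequence in the logs.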
Testing this patch on normal hosts is a bit challenging, though.
The patch was tested by hardcoding a very low timeout here:
diff --git a/libknet/threads_pmtud.c b/libknet/threads_pmtud.c
index 4f0ba0f..5e2b89b 100644
--- a/libknet/threads_pmtud.c
+++ b/libknet/threads_pmtud.c
@@ -261,7 +271,8 @@ retry:
 		/*
 		 * crypto, under pressure, is a royal PITA
 		 */
-		pong_timeout_adj_tmp = dst_link->pong_timeout_adj * 2;
+		//pong_timeout_adj_tmp = dst_link->pong_timeout_adj * dst_link->pmtud_crypto_timeout_multiplier;
+		pong_timeout_adj_tmp = 30 * dst_link->pmtud_crypto_timeout_multiplier;
 	} else {
 		pong_timeout_adj_tmp = dst_link->pong_timeout_adj;
 	}
and by using a long-running version of api_knet_send_crypto_test with a short PMTUd setfreq (10 sec).
Signed-off-by: Fabio M. Di Nitto fdinitto@redhat.com