Branch: refs/heads/pmtud-fixes
Home: https://github.com/kronosnet/kronosnet
Commit: 434299300a2f23acd96f2f287939549ab2944411
https://github.com/kronosnet/kronosnet/commit/434299300a2f23acd96f2f2879395…
Author: Fabio M. Di Nitto <fdinitto(a)redhat.com>
Date: 2019-08-13 (Tue, 13 Aug 2019)
Changed paths:
M libknet/internals.h
M libknet/links.c
M libknet/links.h
M libknet/threads_pmtud.c
Log Message:
-----------
[PMTUd] add dynamic pong timeout when using crypto
Problem originally reported by the Proxmox community: users
observed that, under pressure, the MTU would flap back and forth
between 2 values because the other node's response timed out.
Implement a dynamic timeout multiplier when using crypto that
should solve the problem in a more flexible fashion.
When a timeout hits, the new logs will show:
[knet]: [info] host: host: 1 (passive) best link: 0 (pri: 0)
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (4) for host 1 link: 0
[knet]: [info] pmtud: PMTUD link change for host: 1 link: 0 from 469 to 65429
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [info] pmtud: Global data MTU changed to: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (8) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (16) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (32) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (128) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
And when the latency drops and it is safe to be more responsive again:
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Decreasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
....
Testing this patch on normal hosts is a bit challenging, though.
The patch was tested by hardcoding a super low timeout here:
diff --git a/libknet/threads_pmtud.c b/libknet/threads_pmtud.c
index 4f0ba0f..5e2b89b 100644
--- a/libknet/threads_pmtud.c
+++ b/libknet/threads_pmtud.c
@@ -261,7 +271,8 @@ retry:
/*
* crypto, under pressure, is a royal PITA
*/
- pong_timeout_adj_tmp = dst_link->pong_timeout_adj * 2;
+ //pong_timeout_adj_tmp = dst_link->pong_timeout_adj * dst_link->pmtud_crypto_timeout_multiplier;
+ pong_timeout_adj_tmp = 30 * dst_link->pmtud_crypto_timeout_multiplier;
} else {
pong_timeout_adj_tmp = dst_link->pong_timeout_adj;
}
and by using a long-running version of api_knet_send_crypto_test with a short PMTUd setfreq (10 sec).
Signed-off-by: Fabio M. Di Nitto <fdinitto(a)redhat.com>