Hey Chrissie,
I am still having issues with the latest pmtu code:
1) using crypto, just with knet_bench ping_data test, the MTU changes on each run (I think you mentioned that on IRC, but not sure it's the same problem).
2) there is a spurious link down event on MTU change. I can reproduce this one without crypto and with ping_data test (I set it at 10 secs interval to avoid waiting minutes).
Start knet_bench on both nodes with standard MTU at 1500. At the end of each run, change MTU on the interfaces to:
- 1400 (good).
- 1500 (good).
- 1600 (good, MTU doesn't change).
- back to 1500 (link down event).
From time to time, the first MTU after the down/up event is wrong.
The code is stable under heavy load. I haven't yet tested load + changing MTU at runtime; I'll do that once the basic tests are passing.
Cheers Fabio
On 12/6/2017 5:54 AM, Fabio M. Di Nitto wrote:
Hey Chrissie,
- there is a spurious link down event on MTU change. I can reproduce
this one without crypto and with ping_data test (I set it at 10 secs interval to avoid waiting minutes).
I figured this one out and it's not very pretty.
The workaround is super simple:
diff --git a/libknet/threads_heartbeat.c b/libknet/threads_heartbeat.c
index ffe2f99..0e0eea0 100644
--- a/libknet/threads_heartbeat.c
+++ b/libknet/threads_heartbeat.c
@@ -155,8 +155,8 @@ static void _adjust_pong_timeouts(knet_handle_t knet_h)
 	struct knet_link *dst_link;
 	int link_idx;
 
-	if (pthread_rwlock_wrlock(&knet_h->global_rwlock) != 0) {
-		log_debug(knet_h, KNET_SUB_HEARTBEAT, "Unable to get write lock");
+	if (pthread_rwlock_trywrlock(&knet_h->global_rwlock) != 0) {
+		log_debug(knet_h, KNET_SUB_HEARTBEAT, "_adjust_pong_timeouts: Unable to get write lock");
 		return;
 	}
But the root cause is the PMTUd thread holding a read lock for a very long time when we hit the pthread_cond_timedwait and the response never comes back from the other node (the reason is irrelevant).
That same read lock can block many other operations, either internal (for example the dstcache handler) or external (any API call that requires a write lock), for an almost unknown amount of time.
I don't have a full solution yet. The only way I can think this could work is to have the PMTUd thread use a different lock set and be notified of any configuration change from the main lock.
So in theory, on each run, the PMTUd thread would cache link/host information with a normal read lock, release the lock, get its own internal lock, and on any relevant config change, be notified that the cache is invalid and restart (or something along those lines...). The cache could be as simple as a timestamp of when we started the PMTUd run and what's recorded in memory by the host/link config; we surely don't need a copy of the whole thing.
Any ideas or suggestions are welcome.
Cheers Fabio