On 05/15/2016 07:42 AM, Fabio M. Di Nitto wrote:
Hey Chrissie,
I found the bugs in totemknet that made knet so slow, and now I am
getting up to 50% more cpgbench performance (19MB/sec on 64-byte packets)
vs udpu (13MB/sec).
The first bug is in the dst_host_filter. This is not your fault; your
thinking was sound, but it didn't do the right thing. We need to fix the
knet API doc, and possibly I can also fix the code to be smarter.
The issue is that for RX packets you make an exception: if we are
sending packets to self, you return that the packet is unicast. This
confuses the packet deduplicator, because it will try to match mcast
packets against the unicast seq num.
The dst_host_filter shouldn't change unicast/mcast, but return data from
the packet. In theory this can be simplified because that information is
already inside the onwire packet.
I'll review that soon enough because it's confusing and incorrect.
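A minimal sketch of that idea, to make the intent concrete. Everything here is hypothetical (the sketch_ names, the header layout, the use of 1 for mcast); it is not the libknet API, only an illustration of a filter that reports what the packet says and ignores the TX/RX direction. The 0 = unicast convention follows the snippet quoted below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is NOT the libknet API. */
enum sketch_dir { SKETCH_TX, SKETCH_RX };

/* Hypothetical on-wire header: the sender records whether the packet
 * is mcast and, for unicast, which host it targets. */
struct sketch_hdr {
    uint8_t  is_mcast;   /* 1 = mcast, 0 = unicast */
    uint16_t dst_host;   /* only meaningful when is_mcast == 0 */
};

/* Returns 1 for mcast and 0 for unicast, mirroring the packet itself
 * (the value 1 for mcast is an assumption of this sketch). */
static int sketch_dst_host_filter(const struct sketch_hdr *hdr,
                                  enum sketch_dir dir,
                                  uint16_t *dst_host_ids,
                                  size_t *dst_host_ids_entries)
{
    assert(hdr && dst_host_ids && dst_host_ids_entries);
    (void)dir; /* direction deliberately does not change the answer */

    if (hdr->is_mcast) {
        *dst_host_ids_entries = 0;   /* no explicit destination list */
        return 1;
    }

    dst_host_ids[0] = hdr->dst_host;
    *dst_host_ids_entries = 1;
    return 0;
}
```

With a filter shaped like this, an mcast packet looped back to self is still reported as mcast on RX, so the deduplicator would compare it against the mcast seq num rather than the unicast one.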
Anyway to unblock testing:
+#if 0
if (tx_rx == KNET_NOTIFY_RX) {
dst_host_ids[0] = this_host_id;
*dst_host_ids_entries = 1;
res = 0; /* already home */
}
else {
+#endif
and remove the stray } ;)
The second is the MTU calculation, which is generating unnecessarily
fragmented packets.
extern void totemknet_net_mtu_adjust (void *knet_context, struct
totem_config *totem_config)
{
- fprintf(stderr, "MTU = %d\n", totem_config->net_mtu);
- totem_config->net_mtu = 1444;
+ // fabbione: need to export libknet header size from libknet somehow
+ totem_config->net_mtu -= totemip_udpip_header_size(AF_INET) + 23;
}
This calculation is correct for AF_INET interfaces (we need to fix it
for IPv6 and such, but it's a start).
This also unveiled a less important bug in knet: it doesn't export all
MTU information to applications. It doesn't affect corosync directly
(that hard-coded 23 should be an API call).
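As a worked example of the arithmetic, here is a small standalone sketch. The sketch_ helpers are hypothetical stand-ins, not corosync or knet functions, and they assume the standard header sizes (IPv4 without options 20 bytes, IPv6 40 bytes, UDP 8 bytes). The 23 is the hard-coded knet overhead from the patch above; reusing it unchanged for IPv6 is an assumption.

```c
#include <assert.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Hard-coded knet header overhead taken from the patch; ideally this
 * would come from a libknet API call instead. */
#define SKETCH_KNET_HDR 23

/* Hypothetical stand-in for a UDP/IP header size lookup, assuming
 * IPv4 without options (20), IPv6 fixed header (40) and UDP (8). */
static int sketch_udpip_header_size(int family)
{
    switch (family) {
    case AF_INET:  return 20 + 8;
    case AF_INET6: return 40 + 8;
    default:       return -1;   /* unknown address family */
    }
}

/* Payload budget left once UDP/IP and knet headers are subtracted
 * from the configured netmtu. Returns -1 for an unknown family. */
static int sketch_payload_mtu(int net_mtu, int family)
{
    int ip_udp = sketch_udpip_header_size(family);
    int overhead;

    if (ip_udp < 0) {
        return -1;
    }
    overhead = ip_udp + SKETCH_KNET_HDR;
    assert(net_mtu > overhead);   /* netmtu must leave room for payload */
    return net_mtu - overhead;
}
```

Under those assumptions, sketch_payload_mtu(1500, AF_INET) gives 1500 - 28 - 23 = 1449, and the same call with AF_INET6 gives 1500 - 48 - 23 = 1429.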
Using the combination of those 2 gives me 14 to 15 MB/sec on 64-byte
packets in cpgbench using the default netmtu of 1500.
The last boost is netmtu changes. Bumping netmtu in corosync.conf to
4096 I get to 19MB/sec.
Now there is a bug in corosync netmtu somewhere. I didn't bother to
investigate because it's Sunday.
I tried 8192 and corosync started to misbehave, as if it's not sending
packets till the buffer is full (or that's the feeling I got).
Bumping netmtu to 64000 makes corosync hang.
So there is definitely some work to be done to fix it and most likely
get better performance out of the system.
I also found the issue that's causing membership to take a long time to
form and to change randomly. It is somehow related to the PMTUd thread,
which is doing something funky. I have an idea of what the root problem
is, but I'll need to do proper debugging / investigation.
Fabio