Re: [kronosnet/kronosnet] a2f08e: Switch over all plugins to the module system

28 Nov 2017

      On 11/28/2017 6:23 PM, Ferenc Wágner wrote:
...
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
...
On 11/28/2017 10:00 AM, Ferenc Wágner wrote:
...
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
...
let me stop you a minute here.
Thanks for looking at this experiment proactively!  It has just reached
the point of viability, so let's chat about it if you have the time.
Yeps, I have got time :-)
Apparently more than me...  I'm at home with a sick child, sorry for the
slow reaction.
No worries. I have got 2 kids as well. Family first.
...
...
...
...
what is the end goal of this work?
I tried to summarize it in 96a92847:
Our current practice of dlopening foreign shared libraries is problematic
for several reasons:
* not portable: modules and shared libraries can be different object types
* dependency information is invisible (our canaries mostly solve this)
* hardwiring SONAMES breaks on transitions (KNET_PKG_SONAME solves this)
* symbol versioning information is lost (theoretically solvable)

The preferred way out is generating dynamically loaded private modules
from the main source, which then rely on the dynamic linker to load the
external symbols as usual.

For a longer version please refer to my mail at
https://lists.debian.org/debian-mentors/2017/11/msg00200.html and the
links included by Guillem Jover.
Most of those concerns appear to be coming from a packaging perspective.
I understand them, and some of them are valid.
I think it's a sharp observation.  While knet is evolving faster than
all the plugin dependencies taken together, keeping dlopen at the outer
boundary can be advantageous.
At the end, if any of the external API will change, we will notice one
way or another.
That is why we have the daily CI job running _exactly_ to detect
breakage generated by our build dependencies.
https://ci.kronosnet.org/job/knet-all-daily/
https://docs.google.com/document/d/1q6OZD97H8ZF1WEDTJq6p84dR-6AuL_HrxDLssSdl...
...
...
...
...
We originally didn't implement the modules model because it has
several downsides for little benefit, based on the experience we had
with corosync modules in the past.
I'd be interested to learn about these downsides, could you please
provide some hints or keywords to search for?
I don't think we have recorded them, but here is the rundown:

1. technically speaking, using modules is no different than dlopening a
   shared library.
Yes, on ELF platforms that's true.
...
In that respect, we are simply moving the problem somewhere else. I
could agree (based on the threads above) that containing the problem
within the same upstream is probably saner.
It's the only way I know of to avoid hardwiring foreign library sonames.
Which differ from platform to platform.  Detecting them (as we currently
do) works well enough in simple cases, but NSS is a pig already, having
its symbols spread into several shared objects.  That would be rather
complicated to emulate with dlopens, not to mention symbol versioning.
Honestly, I haven't had the courage to delve into that yet.
See my last email. Let's move to modules and solve the problem at once.
...
...

2. it enforces a strict internal API/ABI between main libknet and the
   plugins, which makes changes to those very complex, especially during
   updates (see also point 4). Right now we want to keep that API/ABI
   free to change and expand.
I'd think keeping both sides in the same upstream makes the internal
API/ABI a non-issue at the upstream level.  Packaging can use strict
versioned dependencies between the plugin packages and the main library
to force upgrading them together (this only exposes the need to restart
the application after library/plugin upgrades).
Packaging only helps you maintain on-disk compatibility between
versions, though. The main library might be loaded by an application
that is not restarted on package upgrade.
...
Other solutions are
module versioning (like library versioning, but independent) or plugin
directory versioning, which enable coexistence of different module ABI
versions (again, the module ABI could change keeping the library ABI
undisturbed).  See also below.
right, same as I suggested in the other email.
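As a rough illustration of the "plugin directory versioning" idea mentioned above, here is a minimal sketch in C. It assumes a per-ABI subdirectory layout; the directory `/usr/lib/kronosnet`, the `abi<N>` naming, and the function `build_module_path` are all hypothetical and do not come from the knet source. Only the module file name `crypto_nss.so` appears in this thread.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical values: neither the directory nor the ABI number comes from knet. */
#define EXAMPLE_PLUGIN_DIR "/usr/lib/kronosnet"
#define EXAMPLE_MODULE_ABI 1u

/*
 * Build "<dir>/abi<N>/<name>.so" into buf, so that modules built for
 * different internal ABI versions can coexist on disk.
 * Returns 0 on success, -1 on invalid arguments or if buf is too small.
 */
int build_module_path(char *buf, size_t len, const char *dir,
                      unsigned int abi, const char *name)
{
    int n;

    if (buf == NULL || dir == NULL || name == NULL || len == 0) {
        return -1;
    }
    n = snprintf(buf, len, "%s/abi%u/%s.so", dir, abi, name);
    if (n < 0 || (size_t)n >= len) {
        return -1; /* truncated: refuse rather than load a wrong path */
    }
    return 0;
}
```

With such a layout, a running application that loaded the library at ABI 1 would keep looking under `abi1/` even after an upgrade installs modules under `abi2/`, provided the packaging leaves the old directory in place.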
...
...

3. it's very difficult to debug modules due to the symbol resolving
   mechanism (Honza in CC can provide more details).
For what it's worth, gdb works fine for me...
wferi@lant:~/ha/kronosnet/kronosnet/libknet/tests$ LD_LIBRARY_PATH=../.libs gdb .libs/int_crypto_test
[...]
Reading symbols from .libs/int_crypto_test...done.
(gdb) b encrypt_nss
Function "encrypt_nss" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (encrypt_nss) pending.
(gdb) r
Starting program: /home/wferi/ha/kronosnet/kronosnet/libknet/tests/.libs/int_crypto_test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5054700 (LWP 1856)]
[New Thread 0x7ffff4853700 (LWP 1857)]
[New Thread 0x7ffff4052700 (LWP 1858)]
[New Thread 0x7ffff3851700 (LWP 1859)]
[New Thread 0x7ffff3050700 (LWP 1860)]
[New Thread 0x7ffff284f700 (LWP 1861)]
[New Thread 0x7ffff204e700 (LWP 1862)]
Test knet_handle_crypto with nss/aes128/sha1 and normal key
knet logs: [info] common: crypto_nss.so has been loaded from /home/wferi/ha/kronosnet/kronosnet/libknet/tests/../.libs/crypto_nss.so
knet logs: [debug] crypto: Initizializing crypto module [nss/aes128/sha1]
knet logs: [debug] nsscrypto: Initizializing nss crypto module [aes128/sha1]
knet logs: [debug] crypto: security network overhead: 68
Source Data: Encrypt me!
Thread 1 "int_crypto_test" hit Breakpoint 1, nsscrypto_encrypt_and_signv (iovcnt_in=1, 
    buf_out_len=0x7fffffffd0b0, buf_out=0x5555579090e0 "", iov_in=0x7fffffffd030, 
    knet_h=0x7ffff5055010) at crypto_nss.c:625
625			if (encrypt_nss(knet_h, iov_in, iovcnt_in, buf_out, buf_out_len) < 0) {
(gdb)
Let's see how it goes. Apparently the plugin module used in corosync was
much more complex and caused problems that theoretically we should not
see here (according to Jan).
...
...
#3 I am not happy about, especially given that the project is still in
its "early" days. This specifically has proven very challenging
in some environments.
Nothing threatening comes to my mind at the moment, but I'd be happy to
look into any concrete problems brought up.
agreed.
...
...

4. runtime operations (updates) can be nasty. You could have
   application X that has loaded libknet 1.0 with modules 1.0.
   An apt-get upgrade then installs libknet 1.1 or whatever, with new
   plugins, the application tries to load a module and kaboom.
Well, yes.  But at least it's in one hand: with modules you've got a
small internal ABI to watch out for, with direct dlopens you've got
several foreign ABIs to follow.  Yes, development speed matters, but as
knet matures, the balance will inevitably tip, if it hasn't already.
...
#4 can be solved by adding some kind of hashing/signing mechanism of
the modules (aka load only modules that match the build).
That sounds somewhat overkill... why not just use a version number if we
really must?  Or introduce an extensible module ABI (with an explicit
size at the front of the model definition) and stay safe for a longer
term?  Just thinking out loud...
Yeah we got to the same conclusion. See again my other reply :-)
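The two alternatives floated above (a plain version number, and an extensible module ABI with an explicit size at the front) can be combined. The following is only a sketch under assumptions: `struct example_module_ops`, its fields, and the check function are hypothetical names made up for illustration, not the actual knet module definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ABI version the library was built against. */
#define EXAMPLE_MODULE_ABI_VERSION 1u

/* Hypothetical module definition: the size comes first so it can grow. */
struct example_module_ops {
    size_t   struct_size;  /* sizeof() as seen by the module at build time */
    uint32_t abi_version;  /* bumped only on incompatible changes */
    int    (*init)(void);
    int    (*fini)(void);  /* pretend this member was appended later */
};

/* True if the module's copy of the struct is large enough to hold member. */
#define EXAMPLE_OPS_HAS(ops, member) \
    ((ops)->struct_size >= \
     offsetof(struct example_module_ops, member) + sizeof((ops)->member))

/* Stand-in for a real module entry point. */
static int example_noop(void)
{
    return 0;
}

/* Returns 0 if the module can be used by this library build, -1 otherwise. */
int example_module_ops_check(const struct example_module_ops *ops)
{
    if (ops == NULL) {
        return -1;
    }
    /* The header fields must be covered before abi_version is trusted. */
    if (!EXAMPLE_OPS_HAS(ops, abi_version)) {
        return -1;
    }
    if (ops->abi_version != EXAMPLE_MODULE_ABI_VERSION) {
        return -1;
    }
    /* init is mandatory; later members such as fini stay optional. */
    if (!EXAMPLE_OPS_HAS(ops, init) || ops->init == NULL) {
        return -1;
    }
    return 0;
}
```

In this sketch an older module that predates `fini` reports a smaller `struct_size`, so the library can test `EXAMPLE_OPS_HAS(ops, fini)` before touching that member, while an incompatible change is signalled by bumping the version number.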
...
...
As for the code I have seen so far, please find another way to pass
log_msg down to the plugins. I am not going to export it to the world :-)
Yeah, that's a wart.  I don't know how to make that symbol accessible in
the usual way in the modules without including its code.  And I wonder
what other internal symbols may need to be exposed later.  Does the
format used in the log pipe constitute ABI?  This might even be a show
stopper.
It's no different than installing a callback. See for example:
int knet_handle_enable_pmtud_notify(knet_handle_t knet_h,
                                    void *pmtud_notify_fn_private_data,
                                    void (*pmtud_notify_fn) (
                                                void *private_data,
                                                unsigned int data_mtu));
The module init would need an extra parameter to pass log_msg in, along
the lines of void (*log_msg_fn) (.....).
At init time, the module stores the pointer to log_msg somewhere internal,
and it needs to do that only once. Should be fairly straightforward.
Fabio
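The callback approach described above could look roughly like the sketch below. The real signature of log_msg is elided in the mail, so the `(int level, const char *fmt, ...)` shape used here is an assumption, and every `example_*` name is hypothetical rather than part of libknet.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed shape of the logging callback; the real log_msg may differ. */
typedef void (*example_log_fn)(int level, const char *fmt, ...);

/* --- library side: a stand-in logger that stays private to the library --- */
static char example_last_log[128];
static int  example_log_calls;

static void example_log_msg(int level, const char *fmt, ...)
{
    va_list ap;

    (void)level; /* unused in this sketch */
    va_start(ap, fmt);
    vsnprintf(example_last_log, sizeof(example_last_log), fmt, ap);
    va_end(ap);
    example_log_calls++;
}

/* --- module side: remembers the callback once, at init time --- */
static example_log_fn module_log;

int example_module_init(example_log_fn log_fn)
{
    if (log_fn == NULL) {
        return -1;
    }
    module_log = log_fn; /* stored once, used for all later logging */
    module_log(0, "%s module initialized", "example");
    return 0;
}
```

The library would call `example_module_init(example_log_msg)` right after loading the module, so the logging symbol never has to be exported from the shared library.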
