"Fabio M. Di Nitto" <fdinitto(a)redhat.com> writes:
On 11/28/2017 10:00 AM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" <fdinitto(a)redhat.com> writes:
let me stop you a minute here.
Thanks for looking at this experiment proactively! It reached the point
of viability right now, so let's chat about them if you have the time.
Yeps, I have got time :-)
Apparently more than me... I'm at home with a sick child, sorry for the
slow reaction.
what is the end goal of this work?
I tried to summarize it in 96a92847:
Our current practice of dlopening foreign shared libraries is problematic
for several reasons:
* not portable: modules and shared libraries can be different object types
* dependency information is invisible (our canaries mostly solve this)
* hardwiring SONAMES breaks on transitions (KNET_PKG_SONAME solves this)
* symbol versioning information is lost (theoretically solvable)
The preferred way out is generating dynamically loaded private modules
from the main source, which then rely on the dynamic linker to load the
external symbols as usual.
For a longer version please refer to my mail at
https://lists.debian.org/debian-mentors/2017/11/msg00200.html and the
links included by Guillem Jover.
Most of those concerns appear to be coming from a packaging perspective.
I understand them, and some of them are valid.
I think it's a sharp observation. While knet is evolving faster than
all the plugin dependencies taken together, keeping dlopen at the outer
boundary can be advantageous.
We originally didn't implement the modules model because it has
several downsides for little benefit, based on the experience we had
with corosync modules in the past.
I'd be interested to learn about these downsides, could you please
provide some hints or keywords to search for?
I don't think we have recorded them, but here is the rundown:
1) technically speaking using modules is no different than dlopening a
shared library.
Yes, on ELF platforms that's true.
In that respect, we are simply moving the problem somewhere else. I
could agree (based on the threads above) that containing the problem
within the same upstream is probably saner.
It's the only way I know of to avoid hardwiring foreign library sonames.
Which differ from platform to platform. Detecting them (as we currently
do) works well enough in simple cases, but NSS is a pig already, having
its symbols spread into several shared objects. That would be rather
complicated to emulate with dlopens, not to mention symbol versioning.
Honestly, I haven't had the courage to delve into that yet.
2) it enforces a strict internal API/ABI between main libknet and the
plugins, making changes to those very complex, especially during
updates (see also point 4). Right now we want to keep those API/ABI
free to change and expand.
I'd think keeping both sides in the same upstream makes the internal
API/ABI a non-issue at the upstream level. Packaging can use strict
versioned dependencies between the plugin packages and the main library
to force upgrading them together (this only exposes the need to restart
the application after library/plugin upgrades). Other solutions are
module versioning (like library versioning, but independent) or plugin
directory versioning, which enable coexistence of different module ABI
versions (again, the module ABI could change keeping the library ABI
undisturbed). See also below.
3) it's very difficult to debug modules due to the symbol resolving
mechanism (Honza in CC can provide more details).
For what it's worth, gdb works fine for me...
wferi@lant:~/ha/kronosnet/kronosnet/libknet/tests$ LD_LIBRARY_PATH=../.libs gdb .libs/int_crypto_test
[...]
Reading symbols from .libs/int_crypto_test...done.
(gdb) b encrypt_nss
Function "encrypt_nss" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (encrypt_nss) pending.
(gdb) r
Starting program: /home/wferi/ha/kronosnet/kronosnet/libknet/tests/.libs/int_crypto_test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5054700 (LWP 1856)]
[New Thread 0x7ffff4853700 (LWP 1857)]
[New Thread 0x7ffff4052700 (LWP 1858)]
[New Thread 0x7ffff3851700 (LWP 1859)]
[New Thread 0x7ffff3050700 (LWP 1860)]
[New Thread 0x7ffff284f700 (LWP 1861)]
[New Thread 0x7ffff204e700 (LWP 1862)]
Test knet_handle_crypto with nss/aes128/sha1 and normal key
knet logs: [info] common: crypto_nss.so has been loaded from
/home/wferi/ha/kronosnet/kronosnet/libknet/tests/../.libs/crypto_nss.so
knet logs: [debug] crypto: Initizializing crypto module [nss/aes128/sha1]
knet logs: [debug] nsscrypto: Initizializing nss crypto module [aes128/sha1]
knet logs: [debug] crypto: security network overhead: 68
Source Data: Encrypt me!
Thread 1 "int_crypto_test" hit Breakpoint 1, nsscrypto_encrypt_and_signv (iovcnt_in=1,
    buf_out_len=0x7fffffffd0b0, buf_out=0x5555579090e0 "", iov_in=0x7fffffffd030,
    knet_h=0x7ffff5055010) at crypto_nss.c:625
625 if (encrypt_nss(knet_h, iov_in, iovcnt_in, buf_out, buf_out_len) < 0) {
(gdb)
#3 I am not happy about, especially given that the project is still in
its "early" days. This specifically has proven very challenging
in some environments.
Nothing threatening comes to my mind at the moment, but I'd be happy to
look into any concrete problems brought up.
4) runtime operations (updates) can be nasty. You could have
application X that has loaded libknet 1.0 with modules 1.0
apt-get update.. get libknet 1.1 or whatever, new plugins,
application tries to load a module and kaboom.
Well, yes. But at least it's in one hand: with modules you've got a
small internal ABI to watch out for, with direct dlopens you've got
several foreign ABIs to follow. Yes, development speed matters, but as
knet matures, the balance will inevitably tip, if it hasn't already.
#4 can be solved by adding some kind of hashing/signing mechanism of
the modules (aka load only modules that match the build).
That sounds somewhat overkill... why not just use a version number if we
really must? Or introduce an extensible module ABI (with an explicit
size at the front of the model definition) and stay safe for a longer
term? Just thinking out loud...
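
To make the "explicit size at the front" idea a bit more concrete, here
is a minimal sketch. Every name in it (module_ops, module_ops_v1,
module_ops_import, MODULE_ABI_VERSION, example_init) is made up for
illustration and is not existing knet code. It assumes that new members
are only ever appended to the structure, so a module built against an
older layout simply leaves the newer members unset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODULE_ABI_VERSION 1u   /* hypothetical version number */

/* Layout an older module was built against (hypothetical). */
struct module_ops_v1 {
    size_t struct_size;      /* sizeof the struct at module build time */
    uint32_t abi_version;
    int (*init)(void);
};

/* Layout the library currently knows: v1 plus one appended member. */
struct module_ops {
    size_t struct_size;
    uint32_t abi_version;
    int (*init)(void);
    int (*fini)(void);       /* appended later; NULL for older modules */
};

/* Stand-in for a module entry point. */
int example_init(void)
{
    return 0;
}

/*
 * Import the table exported by a module into the library's own view.
 * Returns 0 on success, -1 if the module is incompatible.
 */
int module_ops_import(struct module_ops *dst, const void *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    memcpy(&len, src, sizeof(len));      /* size is always the first member */
    if (len < sizeof(struct module_ops_v1))
        return -1;                       /* smaller than any known layout */
    if (len > sizeof(*dst))
        len = sizeof(*dst);              /* newer module: ignore unknown tail */
    *dst = (struct module_ops){0};       /* members not provided stay NULL */
    memcpy(dst, src, len);
    dst->struct_size = len;
    if (dst->abi_version != MODULE_ABI_VERSION)
        return -1;                       /* plain version number check */
    return 0;
}
```

The library would call something like module_ops_import() on the
symbol it gets from the module and test each optional member for NULL
before using it. Whether this buys enough over a bare version number
is exactly the open question.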
As for the code I have seen so far, please find another way to pass
log_msg down to the plugins. I am not going to export it to the world :-)
Yeah, that's a wart. I don't know how to make that symbol accessible in
the usual way in the modules without including its code. And I wonder
what other internal symbols may need to be exposed later. Does the
format used in the log pipe constitute ABI? This might even be a show
stopper.
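
One conceivable direction, sketched below purely as an assumption, is
to hand the module a pointer to the logging function when it is
initialized, so the function itself can stay internal to the library.
The names and the signature (log_fn_t, example_module_init,
internal_log, library_load_example, last_log_line) are hypothetical
and do not match knet's real log_msg; both sides live in one file here
only to keep the sketch self-contained. It does not answer the log
pipe format question.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical logging callback type shared by library and modules. */
typedef void (*log_fn_t)(int level, const char *fmt, ...);

/* ---- module side ---- */
static log_fn_t module_log;              /* callback received at init */

int example_module_init(log_fn_t log_cb)
{
    if (log_cb == NULL)
        return -1;
    module_log = log_cb;
    module_log(1, "module initialized with %s", "callback logging");
    return 0;
}

/* ---- library side ---- */
static char last_line[128];              /* stands in for the log pipe */

/* Internal (static) logger: not visible outside the library. */
static void internal_log(int level, const char *fmt, ...)
{
    va_list ap;
    int n = snprintf(last_line, sizeof(last_line), "[%d] ", level);

    va_start(ap, fmt);
    vsnprintf(last_line + n, sizeof(last_line) - (size_t)n, fmt, ap);
    va_end(ap);
}

/* Return the most recently formatted log line. */
const char *last_log_line(void)
{
    return last_line;
}

/* What the library would do right after loading the module. */
int library_load_example(void)
{
    return example_module_init(internal_log);
}
```

The price is that the callback signature itself becomes part of the
internal module ABI, so it would need the same care as the rest of
the module interface.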
--
Thanks,
Feri