Hey Feri,
let me stop you a minute here.
What is the end goal of this work? We originally didn't implement the modules model because it has several downsides for little benefit, based on the experience we had with corosync modules in the past.
I think it might be beneficial to have a chat about those major changes before you implement them :-) I *hate* for people to waste time on stuff that might not merge.
Cheers Fabio
On 11/28/2017 8:07 AM, GitHub wrote:
Branch: refs/heads/modules
Home: https://github.com/kronosnet/kronosnet
Commit: a2f08e189c0143381f9f3ac5031d6b5dd32f5432
https://github.com/kronosnet/kronosnet/commit/a2f08e189c0143381f9f3ac5031d6b...
Author: Ferenc Wágner wferi@debian.org
Date: 2017-11-28 (Tue, 28 Nov 2017)
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
let me stop you a minute here.
Hey Fabio,
Thanks for looking at this experiment proactively! It has just reached the point of viability, so let's chat about the changes if you have the time. (The patch set creates some rather obvious cleanup opportunities, but I guess we can leave them for later.)
what is the end goal of this work?
I tried to summarize it in 96a92847:
Our current practice of dlopening foreign shared libraries is problematic for several reasons:
* not portable: modules and shared libraries can be different object types
* dependency information is invisible (our canaries mostly solve this)
* hardwiring SONAMES breaks on transitions (KNET_PKG_SONAME solves this)
* symbol versioning information is lost (theoretically solvable)
The preferred way out is generating dynamically loaded private modules from the main source, which then rely on the dynamic linker to load the external symbols as usual.
For a longer version please refer to my mail at https://lists.debian.org/debian-mentors/2017/11/msg00200.html and the links included by Guillem Jover.
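To make the contrast concrete, here is a rough sketch of the two approaches (library and module names are only illustrative, this is not the actual libknet code):

#include <dlfcn.h>
#include <limits.h>
#include <stdio.h>

/* Today: dlopen() the foreign library itself and dig its symbols out.
   The SONAME is hard-wired, the dependency stays invisible to packaging
   tools, and symbol versioning is lost across dlsym(). */
void *open_foreign_compress(void)
{
    void *lib = dlopen("libz.so.1", RTLD_NOW);    /* hard-wired SONAME */

    if (!lib)
        return NULL;
    return dlsym(lib, "compress2");               /* unversioned lookup */
}

/* Proposed: dlopen() a private module built from our own tree.  The module
   is linked against libz normally, so the dynamic linker resolves the right
   SONAME and versioned symbols for us. */
void *open_compress_module(const char *moddir)
{
    char path[PATH_MAX];

    snprintf(path, sizeof(path), "%s/compress_zlib.so", moddir);
    return dlopen(path, RTLD_NOW);                /* only our own module name here */
}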
We originally didn´t implement the modules model because it has several downsides for little benefit, based on the experience we had with corosync modules in the past.
I'd be interested to learn about these downsides, could you please provide some hints or keywords to search for?
I think it might be beneficial to have a chat about those major changes before you implement them :-) I *hate* for people to waste time on stuff that might not merge.
Absolutely right. The true extent of the changes is exaggerated by the diffs (really, it just moves some definitions before their usage and removes lots of boilerplate -- wasting the entirety of my previous canary and soname work), but I'm still proposing an architectural change. Those preprocessor tricks were neat; modules seem to be a better solution to this problem set. Now tell me why they aren't. :)
On 11/28/2017 10:00 AM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
let me stop you a minute here.
Hey Fabio,
Thanks for looking at this experiment proactively! It reached the point of viability right now, so let's chat about them if you have the time.
Yeps, I have got time :-)
(The patch set creates some rather obvious cleanup opportunities, but I guess we can leave them for later.)
what is the end goal of this work?
I tried to summarize it in 96a92847:
Our current practice of dlopening foreign shared libraries is problematic for several reasons:
* not portable: modules and shared libraries can be different object types
* dependency information is invisible (our canaries mostly solve this)
* hardwiring SONAMES breaks on transitions (KNET_PKG_SONAME solves this)
* symbol versioning information is lost (theoretically solvable)
The preferred way out is generating dynamically loaded private modules from the main source, which then rely on the dynamic linker to load the external symbols as usual.
For a longer version please refer to my mail at https://lists.debian.org/debian-mentors/2017/11/msg00200.html and the links included by Guillem Jover.
Most of those concerns appear to be coming from a packaging perspective. I understand them, and some of them are valid.
We originally didn´t implement the modules model because it has several downsides for little benefit, based on the experience we had with corosync modules in the past.
I'd be interested to learn about these downsides, could you please provide some hints or keywords to search for?
I don't think we have recorded them, but here is the rundown:
1) Technically speaking, using modules is no different from dlopening a shared library. In that respect, we are simply moving the problem somewhere else. I could agree (based on the threads above) that containing the problem within the same upstream is probably saner.
2) It enforces a strict internal API/ABI between main libknet and the plugins, making changes to those very complex, especially during updates (see also point 4). Right now we want to keep those API/ABI free to change and expand.
3) It's very difficult to debug modules due to the symbol resolution mechanism (Honza in CC can provide more details).
4) Runtime operations (updates) can be nasty. You could have application X that has loaded libknet 1.0 with modules 1.0; apt-get update gets you libknet 1.1 or whatever, with new plugins, the application tries to load a module and kaboom.
#1 is mostly political/perception really. I don't feel like arguing much about it, but the main point is that we are a very active upstream and I don't see a big issue (for now) with keeping track of other libraries.
#2 I can probably live with, but that would have to be a project/group call.
#3 I am not happy about, especially given that the project is still in its "early" days. This specifically has proven very challenging in some environments.
#4 can be solved by adding some kind of hashing/signing mechanism for the modules (i.e. load only modules that match the build).
I think it might be beneficial to have a chat about those major changes before you implement them :-) I *hate* for people to waste time on stuff that might not merge.
Absolutely right. The true extent of the changes is exaggerated by the diffs (really, it just moves some definitions before their usage and removes lots of boilerplate -- wasting the entirety of my previous canary and soname works), but I'm still proposing an architectural change. Those preprocessor tricks were neat; modules seem to be a better solution to this problem set. Now tell me why they aren't. :)
The patch size doesn't scare me at all.
I am really concerned about point #2 and point #3 above. #4 can be technically solved.
As for the code I have seen so far, please find another way to pass log_msg down to the plugins. I am not going to export it to the world :-)
Fabio
Hi Feri,
I have been talking with Jan and Chrissie about this branch, pros and cons etc.
I think we are all in agreement that we like your move towards modules and we would like you to continue this way.
There are a few bits and pieces of the code we need to fix, though, before I can merge any change of this magnitude.
1) load_compress_lib and load_crypto_lib should be able to stay inside compress.c and crypto.c. I don't think there is a need to move them into common.c.
2) We need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use it either to identify the on-disk path to load the modules from (e.g. pkglibdir/crypto/$VER/ or $VER/crypto) or to check it at load time (or both). This should address downside 4 from my earlier list; a rough sketch follows after this list.
3) log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the function pointer to the init of a given module, or let's find another way to report errors back from the modules to main libknet.
4) You need to start adding your name as Author/Contributor around or I'll stop accepting your patches ;) That includes signing off your commits :-)
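For point 2, I am thinking of something roughly along these lines (just a sketch; the path layout, the PKGLIBDIR define and the module-side version symbol are illustrative, not existing code):

#include <dlfcn.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#define KNET_INTERNAL_CRYPTO_API_VER 1

#ifndef PKGLIBDIR                        /* would come from the build system */
#define PKGLIBDIR "/usr/lib/kronosnet"
#endif

void *load_crypto_module(const char *model)
{
    char path[PATH_MAX];
    void *module;
    const uint8_t *mod_ver;

    /* option a: the version is baked into the on-disk path */
    snprintf(path, sizeof(path), PKGLIBDIR "/crypto/%d/crypto_%s.so",
             KNET_INTERNAL_CRYPTO_API_VER, model);

    module = dlopen(path, RTLD_NOW);
    if (!module)
        return NULL;

    /* option b: the module exports its version and we check it at load time */
    mod_ver = dlsym(module, "knet_crypto_api_ver");
    if (!mod_ver || *mod_ver != KNET_INTERNAL_CRYPTO_API_VER) {
        dlclose(module);
        return NULL;
    }
    return module;
}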
Cheers Fabio
Hey Feri,
quick update.
On 11/28/2017 4:15 PM, Fabio M. Di Nitto wrote:
Hi Feri,
I have been talking with Jan and Chrissie about this branch, pro and cons etc.
I think we are all in agreement that we like your move towards modules and we would like you to continue this way.
There are few bits and pieces of the code we need to fix tho before I can merge any change of this magnitude.
- load_compress_lib and load_crypto_lib should be able to stay inside compress.c and crypto.c. I don´t think there is a need to move them in common.c
this is done in the current module branch.
- we need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use that either to identify the path on disk where to load the modules (ex: pkglibdir/crypto/$VER/ or $VER/crypto) or checked at load time (or both). this should address point 4 below.
- log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the pointer to the function to the init of a given module, or let's find another way to report errors back from the modules to main libknet.
this is done in the current module branch.
- You need to start adding your name as Author/Contributor around or I´ll stop accepting your patches ;) that includes signing-off your commits :-)
Cheers Fabio
On 11/29/2017 8:50 AM, Fabio M. Di Nitto wrote:
Hey Feri,
quick update.
On 11/28/2017 4:15 PM, Fabio M. Di Nitto wrote:
Hi Feri,
I have been talking with Jan and Chrissie about this branch, pro and cons etc.
I think we are all in agreement that we like your move towards modules and we would like you to continue this way.
There are few bits and pieces of the code we need to fix tho before I can merge any change of this magnitude.
- load_compress_lib and load_crypto_lib should be able to stay inside compress.c and crypto.c. I don´t think there is a need to move them in common.c
this is done in the current module branch.
- we need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use that either to identify the path on disk where to load the modules (ex: pkglibdir/crypto/$VER/ or $VER/crypto) or checked at load time (or both). this should address point 4 below.
For now I added an internal API version check. I am not sure yet whether we also need versioning on disk. It might be overkill. What do you think?
Fabio
- log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the pointer to the function to the init of a given module, or let´s find another way to report errors back from the modules to main libknet.
this is done in the current module branch.
- You need to start adding your name as Author/Contributor around or I´ll stop accepting your patches ;) that includes signing-off your commits :-)
Cheers Fabio
"Fabio M. Di Nitto" fabbione@fabbione.net writes:
On 11/29/2017 8:50 AM, Fabio M. Di Nitto wrote:
On 11/28/2017 4:15 PM, Fabio M. Di Nitto wrote:
- load_compress_lib and load_crypto_lib should be able to stay inside compress.c and crypto.c. I don´t think there is a need to move them in common.c
this is done in the current module branch.
Hi Fabio,
Thanks for fixing this up. I parked them in common.c in the hope of future unification. The only algorithmic difference was the extra initialization call in load_crypto_lib(), which I really didn't understand in the light of the already existing init method, but left alone for the time being. Now I see you already got rid of it, which is great.
Going further, the copying of the individual pointers could be replaced by defining pure "ops" structures in the modules, and only copying their single addresses into the respective ops pointers of the model structures. The rest is just string differences as far as I remember.
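Roughly what I have in mind, with purely illustrative names (not the current code):

#include <dlfcn.h>
#include <stddef.h>

/* shared between the core and the modules */
struct crypto_ops {
    int  (*init)(void *knet_h, void *crypto_cfg);
    void (*fini)(void *knet_h);
    int  (*encrypt)(void *knet_h, const void *in, size_t inlen,
                    void *out, size_t *outlen);
    int  (*decrypt)(void *knet_h, const void *in, size_t inlen,
                    void *out, size_t *outlen);
};

/* in the core: a model stores one ops pointer instead of N copied pointers */
struct crypto_model {
    const char *model_name;
    const struct crypto_ops *ops;
};

int attach_crypto_ops(struct crypto_model *model, void *module)
{
    model->ops = dlsym(module, "crypto_ops");     /* copy a single address */
    return model->ops ? 0 : -1;
}

Each module would export one constant "crypto_ops" structure pointing at its own functions, and the loader above is all the core needs.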
There is also the lz4/lz4hc duality. Those models share a decompress method, which is currently duplicated in the two modules. It's no big deal because that code is tiny, but in the future it might be useful to contain the model name in the name of the exported symbol, optionally decoupling it from the module soname.
- we need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use that either to identify the path on disk where to load the modules (ex: pkglibdir/crypto/$VER/ or $VER/crypto) or checked at load time (or both). this should address point 4 below.
For now I added an internal API version check. I am not sure yet we need to add also versioning on disk. It might be overkilling. What do you think?
I think module directory versioning would be better, as it allows coexistence of different module ABIs. That can help in your upgrade scenario, if the (already loaded and running) core library is upgraded and the modules with the old ABI version are left installed.
Speaking about the version check: is it necessary to store the version in the model description? It's a generic constant after all.
- log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the pointer to the function to the init of a given module, or let´s find another way to report errors back from the modules to main libknet.
this is done in the current module branch.
Similarly to the above, is the function pointer worth storing in every handle? I understand it's handy, because it's being passed around anyway. My idea was writing it into another module symbol at load time and defining the logging macros to use that when compiling modules. Probably doesn't matter much, though.
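In code the idea would be something like this sketch (hypothetical names, none of this exists yet):

#include <dlfcn.h>
#include <stdint.h>

typedef void (*log_msg_fn)(void *knet_h, uint8_t subsystem, uint8_t level,
                           const char *fmt, ...);

#ifdef KNET_MODULE
/* in every module: one writable slot, filled in by the core at load time */
log_msg_fn knet_module_log_msg;
#define log_msg(h, sub, lvl, ...) knet_module_log_msg((h), (sub), (lvl), __VA_ARGS__)
#endif

/* in the core, right after dlopen() of a module: */
int wire_up_module_logging(void *module, log_msg_fn real_log_msg)
{
    log_msg_fn *slot = dlsym(module, "knet_module_log_msg");

    if (!slot)
        return -1;
    *slot = real_log_msg;
    return 0;
}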
I hope to get back to coding instead of theorizing soon... and I've just subscribed to devel@l.k.o to facilitate email communication. :)
On 11/29/2017 2:52 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fabbione@fabbione.net writes:
On 11/29/2017 8:50 AM, Fabio M. Di Nitto wrote:
On 11/28/2017 4:15 PM, Fabio M. Di Nitto wrote:
- load_compress_lib and load_crypto_lib should be able to stay inside compress.c and crypto.c. I don´t think there is a need to move them in common.c
this is done in the current module branch.
Hi Fabio,
Thanks for fixing this up. I parked them in common.c in the hope of future unification. The only algorithmic difference was the extra initialization call in load_crypto_lib(), which I really didn't understand in the light of the already existing init method, but left alone for the time. Now I see you already got rid of it, which is great.
Going further, the copying of the individual pointers could be replaced by defining pure "ops" structures in the modules, and only copying their single addresses into the respective ops pointers of the model structures. The rest is just string differences as far as I remember.
sounds about right.
There is also the lz4/lz4hc duality. Those models share a decompress method, which is currently duplicated in the two modules. It's no big deal because that code is tiny, but in the future it might be useful to contain the model name in the name of the exported symbol, optionally decoupling it from the module soname.
I have been thinking about it too but I don't think it's too urgent.
- we need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use that either to identify the path on disk where to load the modules (ex: pkglibdir/crypto/$VER/ or $VER/crypto) or checked at load time (or both). this should address point 4 below.
For now I added an internal API version check. I am not sure yet we need to add also versioning on disk. It might be overkilling. What do you think?
I think module directory versioning would be better, as it allows coexistence of different module ABIs. That can help in your upgrade scenario, if the (already loaded and running) core library is upgraded and the modules with the old ABI version are left installed.
libknet1-1.0-x has internal API 1
libknet1-1.0-y has internal API 2
On update, the old dir would go away no matter what, unless you want to do metapackages similar to the kernel and keep them around. IMHO, not worth it.
Speaking about the version check: is it necessary to store the version in the model description? It's a generic constant after all.
It is the easiest way to compare them and it's only a uint8_t. Hardly a resource issue ;)
I tested doing a build with version 1, copying the modules, doing a version 2 build, overwriting the modules, etc. Each module needs to know its own version.
- log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the pointer to the function to the init of a given module, or let´s find another way to report errors back from the modules to main libknet.
this is done in the current module branch.
Similarly to the above, is the function pointer worth storing in every handle? I understand it's handy, because it's being passed around anyway. My idea was writing it into another module symbol at load time and defining the logging macros to use that when compiling modules. Probably doesn't matter much, though.
I have tried to write it into a module symbol, but there was a remapping issue that had to be handled per module. This one works across the board without per-module changes. It's a minor technicality IMHO.
I hope to get back to coding instead of theorizing soon... and I've just subscribed to devel@l.k.o to facilitate email communication. :)
Cool... no worries, take your time.
Fabio
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/29/2017 2:52 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fabbione@fabbione.net writes:
On 11/29/2017 8:50 AM, Fabio M. Di Nitto wrote:
On 11/28/2017 4:15 PM, Fabio M. Di Nitto wrote:
- we need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use that either to identify the path on disk where to load the modules (ex: pkglibdir/crypto/$VER/ or $VER/crypto) or checked at load time (or both). this should address point 4 below.
For now I added an internal API version check. I am not sure yet we need to add also versioning on disk. It might be overkilling. What do you think?
I think module directory versioning would be better, as it allows coexistence of different module ABIs. That can help in your upgrade scenario, if the (already loaded and running) core library is upgraded and the modules with the old ABI version are left installed.
libknet1-1.0-x has internal API 1
libknet1-1.0-y has internal API 2
on update, the old dir would go away no matter what, unless you want to do metapackages similar to kernel and keep them around. IMHO, not worth it.
Consider the plugin packages as well:
libknet-1-crypto-nss ships /usr/lib/kronosnet-1/crypto_nss.so (ABI 1)
libknet-2-crypto-nss ships /usr/lib/kronosnet-2/crypto_nss.so (ABI 2)
libknet1_x has internal ABI 1, recommends libknet-1-crypto-nss
libknet1_y has internal ABI 2, recommends libknet-2-crypto-nss
The running application depends on libknet1 and loads libknet.so.1.x into memory. Upgrade libknet1 to version y; it pulls in the new package libknet-2-crypto-nss. The running application can still load its ABI 1 NSS module, because libknet-1-crypto-nss is still installed. Once restarted, it loads the new libknet.so.1.y, which can load the ABI 2 NSS module. It's all seamless from the user PoV if libknet-1-crypto-nss isn't removed too early.
The risky part is that libknet1_y may not pull libknet-2-crypto-nss strongly enough. Debian installs recommended packages by default, but not suggested ones. So suggested plugins may be lost during a minor upgrade of libknet1, which doesn't sound very kind... I haven't thought about metapackages yet.
All this complication does not arise if we just fail cleanly based on version checks. Which may be good enough for corosync, but other, hypothetical cases may trigger module loading based on foreign network traffic, if I understood correctly. Anyway, I can't see problems adding directory versioning later if needed.
Speaking about the version check: is it necessary to store the version in the model description? It's a generic constant after all.
It is the easiest way to compare them and it´s only a uint8_t. Hardly a resource issue ;)
Certainly not, but it raises the question of when these ABI versions can ever differ. Actually, never. So why store it in each model instead of simply compiling it into the module loader routine?
I tested doing build with version 1, copy the modules, do a version 2 build, overwrite the modules, etc.. each module needs to know its own version.
Sure, it's compiled into each module as well. My point is that all models come from a single build of libknet.so, while the modules may come from different builds.
- log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the pointer to the function to the init of a given module, or let´s find another way to report errors back from the modules to main libknet.
this is done in the current module branch.
Similarly to the above, is the function pointer worth storing in every handle? I understand it's handy, because it's being passed around anyway. My idea was writing it into another module symbol at load time and defining the logging macros to use that when compiling modules. Probably doesn't matter much, though.
I have tried to write it into a module symbol, but there was a remapping issue that had to be done per module. This one works across the board without per module changes. It´s a minor technicality IMHO.
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/29/2017 2:52 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fabbione@fabbione.net writes:
On 11/29/2017 8:50 AM, Fabio M. Di Nitto wrote:
On 11/28/2017 4:15 PM, Fabio M. Di Nitto wrote:
- we need to define something like #define KNET_INTERNAL_CRYPTO_API_VER so that we can use that either to identify the path on disk where to load the modules (ex: pkglibdir/crypto/$VER/ or $VER/crypto) or checked at load time (or both). this should address point 4 below.
For now I added an internal API version check. I am not sure yet we need to add also versioning on disk. It might be overkilling. What do you think?
I think module directory versioning would be better, as it allows coexistence of different module ABIs. That can help in your upgrade scenario, if the (already loaded and running) core library is upgraded and the modules with the old ABI version are left installed.
libknet1-1.0-x has internal API 1
libknet1-1.0-y has internal API 2
on update, the old dir would go away no matter what, unless you want to do metapackages similar to kernel and keep them around. IMHO, not worth it.
Consider the plugin packages as well:
libknet-1-crypto-nss ships /usr/lib/kronosnet-1/crypto_nss.so (ABI 1) libknet-2-crypto-nss ships /usr/lib/kronosnet-2/crypto_nss.so (ABI 2)
libknet1_x has internal ABI 1, recommends libknet-1-crypto-nss libknet1_y has internal ABI 2, recommends libknet-2-crypto-nss
Running application depends on libknet1, loads libknet.so.1.x into memory. Upgrade libknet1 to version y, it pulls in the new package libknet-2-crypto-nss. Running application can still load its ABI 1 NSS module, because libknet-1-crypto-nss is still installed. One restarted, it loads the new libknet.so.1.y, which can load the ABI 2 NSS module. It's all seamless from the user PoV if libknet-1-crypto-nss isn't removed too early.
That I understand, but you end up leaving cruft behind on disk, and packages that are not in use anymore. The libknet1-plugin1-whatever-name is obsoleted, but nothing removes it.
The risky part is that libknet1_y may not pull libknet-2-crypto-nss strongly enough. Debian installs recommended packages by default, but not suggested ones. So suggested plugins may be lost during a minor upgrade of libknet1, which doesn't sound very kind... I haven't thought about metapackages yet.
In the rpm world the problem doesn't really exist because there is no real concept of recommends or suggests. To be honest, you could easily just Depends: on all of them. The disk space usage is ridiculously low even if you install all of it. The real benefit of the whole plugin/dlopen approach is to save runtime resources.
All this complication does not arise if we just fail cleanly based on version checks. Which may be good enough for corosync, but hypothetic other cases may trigger module loading based on foreign network traffic, if I understood correctly. Anyway, I can't see problems adding directory versioning later if needed.
Right, I am not saying we will never do on-disk versioning; I would say we check first whether we really need it :-)
Speaking about the version check: is it necessary to store the version in the model description? It's a generic constant after all.
It is the easiest way to compare them and it´s only a uint8_t. Hardly a resource issue ;)
Certainly not, but it begs the question that when these ABI versions can be different. Actually, never. So why store it in each model instead of simply compiling it into the module loader routine?
If you are looking at it from a pure packaging perspective, with a perfectly clean system and a user who doesn't try to cheat the system, then yes, you are right, it cannot happen.
But, if you are a developer with a dirty installation, or some sysadmin trying to use an obsoleted module on a new build, then it will catch the issue.
I tested doing build with version 1, copy the modules, do a version 2 build, overwrite the modules, etc.. each module needs to know its own version.
Sure, it's compiled into each module as well. My point is that all models come from a single build of libknet.so, while the modules may come from different builds.
OK, perhaps I am not understanding what you mean, but that's fine. Just show me the code and I'll look at it :-)
- log_msg should not be exported in the public API. This is a no-no from me ;) Either pass the pointer to the function to the init of a given module, or let´s find another way to report errors back from the modules to main libknet.
this is done in the current module branch.
Similarly to the above, is the function pointer worth storing in every handle? I understand it's handy, because it's being passed around anyway. My idea was writing it into another module symbol at load time and defining the logging macros to use that when compiling modules. Probably doesn't matter much, though.
I have tried to write it into a module symbol, but there was a remapping issue that had to be done per module. This one works across the board without per module changes. It´s a minor technicality IMHO.
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
Sure thing. And that's right, but show me what you mean and I'll happily look at it.
Fabio
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
Can you please rebase it on top of the latest modules branch? It's missing / reverting the latest spec file changes.
Fabio
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
can you please rebase it on top of the latest modules branch? it´s missing / reverting the latest spec file changes.
Not easily, but I cherry-picked them, hope it helps.
On 11/30/2017 4:54 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
can you please rebase it on top of the latest modules branch? it´s missing / reverting the latest spec file changes.
Not easily, but I cherry-picked them, hope it helps.
Ack, not a problem. We might need to cherry-pick a bit more (like the valgrind exception for BSD), but let's see.
Also, I see you are #define'ing KNET_MODULE for each module to handle log_msg, and you need to pass the log_msg symbol around on each load.
I am not against that patch, but I think just passing the pointer around via knet_h is less invasive and needs less "magic" :-)
Fabio
PS I love the work on configure.ac, that's definitely going in.
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 4:54 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
can you please rebase it on top of the latest modules branch? it´s missing / reverting the latest spec file changes.
Not easily, but I cherry-picked them, hope it helps.
Ack, not a problem. We might need to cherry pick a bit more (like the valgrind exception for BSD), but let see.
Done, thanks. I hope this will help the BSD builds...
Also, I see you are #define KNET_MODULE for each module to handle the log_msg and you need to pass log_msg symbols around on each load.
I am not against that patch, but i think just passing the pointer around via knet_h is less invasive and needs less "magic" :-)
Either way works for me; I don't know enough about knet_h to tell whether a pointer to log_msg belongs there. Mainly, I wanted to try and offer this approach. Since it's a fixed value, storing it separately in each handle seems a little awkward. BTW it's possible to get rid of the KNET_MODULE defines by using indirection even inside the libknet core. Efficiency surely suffers more due to my ops structure than this. But again I find that straightforward, so maybe worth it... It's a pity it has to go through a void * in load_module().
PS I love the work on configure.ac, that´s definitely going in.
I spent quite some time trying to avoid introducing new defines, but no preprocessor magic I could come up with cut it. On the other hand, the original BUILD* macros now see very little use (two tests only). Splitting those tests would let us get rid of the defines and use the identically named Automake conditionals exclusively. Which would be a nice cleanup in my opinion.
On 11/30/2017 5:33 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 4:54 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
can you please rebase it on top of the latest modules branch? it´s missing / reverting the latest spec file changes.
Not easily, but I cherry-picked them, hope it helps.
Ack, not a problem. We might need to cherry pick a bit more (like the valgrind exception for BSD), but let see.
Done, thanks. I hope this will help the BSD builds...
Also, I see you are #define KNET_MODULE for each module to handle the log_msg and you need to pass log_msg symbols around on each load.
I am not against that patch, but i think just passing the pointer around via knet_h is less invasive and needs less "magic" :-)
Either way works for me, I don't know enough about knet_h to tell whether a pointer to log_msg belongs there. Mainly, I wanted to try and offer this approach. Since it's a fixed value, storing it separately in each handle seems a little awkward.
Yeah I see your point. knet_h is completely hidden from the user, so it doesn't matter what we store there internally (so to speak).
Most applications will have one or two handles, I doubt we will see a proliferation of many handles :-)
BTW it's possible to get rid of the KNET_MODULE defines by using indirection even inside libknet core. Efficiency surely suffers more due to my ops structure than this. But again I find that straightforward, so maybe worth it... It's a pity it has to go through a void * in load_module().
Yeah... I don't have a strong opinion. Chrissie, do you mind being the quorum discriminator here? ;)
PS I love the work on configure.ac, that´s definitely going in.
I spent quite some time trying to avoid introducing new defines, but no preprocessor magic I could come up with cut it. On the other hand, the original BUILD* macros now see very little use (two tests only). Splitting those tests would let us get rid of the defines and use the identically named Automake conditionals exclusively. Which would be a nice cleanup in my opinion.
Sure, go for it.
Fabio
On 11/30/2017 5:49 PM, Fabio M. Di Nitto wrote:
On 11/30/2017 5:33 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 4:54 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
can you please rebase it on top of the latest modules branch? it´s missing / reverting the latest spec file changes.
Not easily, but I cherry-picked them, hope it helps.
Ack, not a problem. We might need to cherry pick a bit more (like the valgrind exception for BSD), but let see.
Done, thanks. I hope this will help the BSD builds...
Also, I see you are #define KNET_MODULE for each module to handle the log_msg and you need to pass log_msg symbols around on each load.
I am not against that patch, but i think just passing the pointer around via knet_h is less invasive and needs less "magic" :-)
Either way works for me, I don't know enough about knet_h to tell whether a pointer to log_msg belongs there. Mainly, I wanted to try and offer this approach. Since it's a fixed value, storing it separately in each handle seems a little awkward.
Yeah I see your point. knet_h is completely hidden from the user, so it doesn´t matter what we store there internally (so to speak).
Most applications will have one or two handles, I doubt we will see a proliferation of many handles :-)
BTW it's possible to get rid of the KNET_MODULE defines by using indirection even inside libknet core. Efficiency surely suffers more due to my ops structure than this. But again I find that straightforward, so maybe worth it... It's a pity it has to go through a void * in load_module().
Yeah.. I don´t have a strong opinion. Chrissie? do you mind to be quorum discriminator here? ;)
PS I love the work on configure.ac, that´s definitely going in.
I spent quite some time trying to avoid introducing new defines, but no preprocessor magic I could come up with cut it. On the other hand, the original BUILD* macros now see very little use (two tests only). Splitting those tests would let us get rid of the defines and use the identically named Automake conditionals exclusively. Which would be a nice cleanup in my opinion.
Sure, go for it.
Actually, even easier: just check whether the plugin is built, using knet_handle_get_(compress|crypto)_list, and run or skip the test accordingly (a rough sketch follows below).
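Roughly (the struct and the signature here are just guesses to illustrate the idea, not the real API):

#include <stddef.h>
#include <string.h>

struct crypto_info {                       /* hypothetical shape of a list entry */
    const char *name;
};

/* assumed to be provided by libknet in some form */
extern int knet_handle_get_crypto_list(struct crypto_info *list, size_t *entries);

int have_crypto_model(const char *model)
{
    struct crypto_info list[16];
    size_t n = sizeof(list) / sizeof(list[0]);
    size_t i;

    if (knet_handle_get_crypto_list(list, &n) < 0)
        return 0;                          /* cannot tell: skip the test */
    for (i = 0; i < n; i++)
        if (!strcmp(list[i].name, model))
            return 1;
    return 0;
}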
Fabio
On 30/11/17 16:49, Fabio M. Di Nitto wrote:
On 11/30/2017 5:33 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 4:54 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/30/2017 3:26 PM, Ferenc Wágner wrote:
You made me curious. Let me try to implement it... OK, I rebased and squashed my branch somewhat. I'm pushing it as modules2, we had some very similar ideas and some different ones!
can you please rebase it on top of the latest modules branch? it´s missing / reverting the latest spec file changes.
Not easily, but I cherry-picked them, hope it helps.
Ack, not a problem. We might need to cherry pick a bit more (like the valgrind exception for BSD), but let see.
Done, thanks. I hope this will help the BSD builds...
Also, I see you are #define KNET_MODULE for each module to handle the log_msg and you need to pass log_msg symbols around on each load.
I am not against that patch, but i think just passing the pointer around via knet_h is less invasive and needs less "magic" :-)
Either way works for me, I don't know enough about knet_h to tell whether a pointer to log_msg belongs there. Mainly, I wanted to try and offer this approach. Since it's a fixed value, storing it separately in each handle seems a little awkward.
Yeah I see your point. knet_h is completely hidden from the user, so it doesn´t matter what we store there internally (so to speak).
Most applications will have one or two handles, I doubt we will see a proliferation of many handles :-)
BTW it's possible to get rid of the KNET_MODULE defines by using indirection even inside libknet core. Efficiency surely suffers more due to my ops structure than this. But again I find that straightforward, so maybe worth it... It's a pity it has to go through a void * in load_module().
Yeah.. I don´t have a strong opinion. Chrissie? do you mind to be quorum discriminator here? ;)
I vote for Feri's solution. To me it makes more sense than indirecting the logging for everything via a function indirection; finding the logging function for yourself is normal for modules IME.
Fabio has also ACKed this from the airport :)
Chrissie
PS I love the work on configure.ac, that´s definitely going in.
I spent quite some time trying to avoid introducing new defines, but no preprocessor magic I could come up with cut it. On the other hand, the original BUILD* macros now see very little use (two tests only). Splitting those tests would let us get rid of the defines and use the identically named Automake conditionals exclusively. Which would be a nice cleanup in my opinion.
Sure, go for it.
Fabio
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/28/2017 10:00 AM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
let me stop you a minute here.
Thanks for looking at this experiment proactively! It reached the point of viability right now, so let's chat about them if you have the time.
Yeps, I have got time :-)
Apparently more than me... I'm at home with a sick child, sorry for the slow reaction.
what is the end goal of this work?
I tried to summarize it in 96a92847:
Our current practice of dlopening foreign shared libraries is problematic for several reasons:
* not portable: modules and shared libraries can be different object types
* dependency information is invisible (our canaries mostly solve this)
* hardwiring SONAMES breaks on transitions (KNET_PKG_SONAME solves this)
* symbol versioning information is lost (theoretically solvable)
The preferred way out is generating dynamically loaded private modules from the main source, which then rely on the dynamic linker to load the external symbols as usual.
For a longer version please refer to my mail at https://lists.debian.org/debian-mentors/2017/11/msg00200.html and the links included by Guillem Jover.
Most of those concerns appears to be coming from a packing perspective. I understand them, and some of them are valid.
I think it's a sharp observation. While knet is evolving faster than all the plugin dependencies taken together, keeping dlopen at the outer boundary can be advantageous.
We originally didn´t implement the modules model because it has several downsides for little benefit, based on the experience we had with corosync modules in the past.
I'd be interested to learn about these downsides, could you please provide some hints or keywords to search for?
I don´t think we have recorded them, but here is the rundown:
- technically speaking using modules is no different than dlopening a
shared library.
Yes, on ELF platforms that's true.
In that respect, we are simply moving the problem somewhere else. I could agree (based on the threads above) that containing the problem within the same upstream is probably saner.
It's the only way I know of to avoid hardwiring foreign library sonames, which differ from platform to platform. Detecting them (as we currently do) works well enough in simple cases, but NSS is a pig already, having its symbols spread over several shared objects. That would be rather complicated to emulate with dlopens, not to mention symbol versioning. Honestly, I haven't had the courage to delve into that yet.
- it enforces a strict internal API/ABI between main libknet and the
plugins. Making changes to those very complex, specially during updates (see also point 4). Right now we want to keep those API/ABI free to change and expand.
I'd think keeping both sides in the same upstream makes the internal API/ABI a non-issue at the upstream level. Packaging can use strict versioned dependencies between the plugin packages and the main library to force upgrading them together (this only exposes the need to restart the application after library/plugin upgrades). Other solutions are module versioning (like library versioning, but independent) or plugin directory versioning, which enable coexistence of different module ABI versions (again, the module ABI could change keeping the library ABI undisturbed). See also below.
- it´s very difficult to debug modules due the symbol resolving
mechanism (Honza in CC can provide more details).
For what it's worth, gdb works fine for me...
wferi@lant:~/ha/kronosnet/kronosnet/libknet/tests$ LD_LIBRARY_PATH=../.libs gdb .libs/int_crypto_test
[...]
Reading symbols from .libs/int_crypto_test...done.
(gdb) b encrypt_nss
Function "encrypt_nss" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (encrypt_nss) pending.
(gdb) r
Starting program: /home/wferi/ha/kronosnet/kronosnet/libknet/tests/.libs/int_crypto_test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5054700 (LWP 1856)]
[New Thread 0x7ffff4853700 (LWP 1857)]
[New Thread 0x7ffff4052700 (LWP 1858)]
[New Thread 0x7ffff3851700 (LWP 1859)]
[New Thread 0x7ffff3050700 (LWP 1860)]
[New Thread 0x7ffff284f700 (LWP 1861)]
[New Thread 0x7ffff204e700 (LWP 1862)]
Test knet_handle_crypto with nss/aes128/sha1 and normal key
knet logs: [info] common: crypto_nss.so has been loaded from /home/wferi/ha/kronosnet/kronosnet/libknet/tests/../.libs/crypto_nss.so
knet logs: [debug] crypto: Initizializing crypto module [nss/aes128/sha1]
knet logs: [debug] nsscrypto: Initizializing nss crypto module [aes128/sha1]
knet logs: [debug] crypto: security network overhead: 68
Source Data: Encrypt me!

Thread 1 "int_crypto_test" hit Breakpoint 1, nsscrypto_encrypt_and_signv (iovcnt_in=1, buf_out_len=0x7fffffffd0b0, buf_out=0x5555579090e0 "", iov_in=0x7fffffffd030, knet_h=0x7ffff5055010) at crypto_nss.c:625
625         if (encrypt_nss(knet_h, iov_in, iovcnt_in, buf_out, buf_out_len) < 0) {
(gdb)
#3 i am not happy about. specially given that the project is still in it´s "early" days. this specifically has proven very challenging in some environments.
Nothing threatening comes to my mind at the moment, but I'd be happy to look into any concrete problems brought up.
- runtime operations (updates) can be nasty. You could have
application X that has loaded libknet 1.0 with modules 1.0 apt-get update.. get libknet 1.1 or whatever, new plugins, application tries to load a module and kaboom.
Well, yes. But at least it's all in one hand: with modules you've got a small internal ABI to watch out for; with direct dlopens you've got several foreign ABIs to follow. Yes, development speed matters, but as knet matures, the balance will inevitably tip, if it hasn't already.
#4 can be solved by adding some kind of hashing/signing mechanism of the modules (aka load only modules that match the build).
That sounds like overkill... why not just use a version number if we really must? Or introduce an extensible module ABI (with an explicit size at the front of the model definition) and stay safe for the longer term? Just thinking out loud...
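Thinking out loud in code, the size-first idea could look roughly like this (names made up for illustration):

#include <stddef.h>

struct crypto_model_api {
    size_t struct_size;        /* always first; the module sets it to sizeof() at build time */
    const char *model_name;
    int (*init)(void *knet_h, void *cfg);
    /* new members only ever get appended after this point */
};

/* a member is usable only if the module's structure was big enough to hold it */
int model_has_field(const struct crypto_model_api *m,
                    size_t field_offset, size_t field_size)
{
    return m->struct_size >= field_offset + field_size;
}

A newer core could then test, say, model_has_field(m, offsetof(struct crypto_model_api, init), sizeof(m->init)) before touching a member an older module may not provide.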
As for the code I have seen so far, please find another way to pass log_msg down to the plugins. I am not going to export it to the world :-)
Yeah, that's a wart. I don't know how to make that symbol accessible in the usual way in the modules without including its code. And I wonder what other internal symbols may need to be exposed later. Does the format used in the log pipe constitute ABI? This might even be a show stopper.
On 11/28/2017 6:23 PM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
On 11/28/2017 10:00 AM, Ferenc Wágner wrote:
"Fabio M. Di Nitto" fdinitto@redhat.com writes:
let me stop you a minute here.
Thanks for looking at this experiment proactively! It reached the point of viability right now, so let's chat about them if you have the time.
Yeps, I have got time :-)
Apparently more than me... I'm at home with a sick child, sorry for the slow reaction.
No worries. I have got 2 kids as well. Family first.
what is the end goal of this work?
I tried to summarize it in 96a92847:
Our current practice of dlopening foreign shared libraries is problematic for several reasons:
* not portable: modules and shared libraries can be different object types
* dependency information is invisible (our canaries mostly solve this)
* hardwiring SONAMES breaks on transitions (KNET_PKG_SONAME solves this)
* symbol versioning information is lost (theoretically solvable)
The preferred way out is generating dynamically loaded private modules from the main source, which then rely on the dynamic linker to load the external symbols as usual.
For a longer version please refer to my mail at https://lists.debian.org/debian-mentors/2017/11/msg00200.html and the links included by Guillem Jover.
Most of those concerns appear to be coming from a packaging perspective. I understand them, and some of them are valid.
I think it's a sharp observation. While knet is evolving faster than all the plugin dependencies taken together, keeping dlopen at the outer boundary can be advantageous.
In the end, if any of the external APIs change, we will notice one way or another. That is why we have the daily CI job running, _exactly_ to detect breakage generated by our build dependencies.
https://ci.kronosnet.org/job/knet-all-daily/
https://docs.google.com/document/d/1q6OZD97H8ZF1WEDTJq6p84dR-6AuL_HrxDLssSdl...
We originally didn´t implement the modules model because it has several downsides for little benefit, based on the experience we had with corosync modules in the past.
I'd be interested to learn about these downsides, could you please provide some hints or keywords to search for?
I don´t think we have recorded them, but here is the rundown:
- technically speaking using modules is no different than dlopening a
shared library.
Yes, on ELF platforms that's true.
In that respect, we are simply moving the problem somewhere else. I could agree (based on the threads above) that containing the problem within the same upstream is probably saner.
It's the only way I know of to avoid hardwiring foreign library sonames, which differ from platform to platform. Detecting them (as we currently do) works well enough in simple cases, but NSS is a pig already, having its symbols spread across several shared objects. That would be rather complicated to emulate with dlopens, not to mention symbol versioning. Honestly, I haven't had the courage to delve into that yet.
See my last email. Let's move to modules and solve the problem at once.
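The soname problem above boils down to code along these lines; this is a simplified sketch, not the actual libknet implementation ("libnss3.so" and PK11_GetBestSlot are only used as familiar examples, and the declared signature is abbreviated):

/*
 * Simplified sketch of dlopening a foreign library directly: the
 * SONAME and every symbol name must be hardwired and resolved by
 * hand, and symbol version information cannot be expressed this way.
 */
#include <dlfcn.h>
#include <stdio.h>

static void *nss_handle;
static void *(*pk11_getbestslot)(unsigned long type, void *wincx); /* abbreviated */

static int setup_nss_by_hand(void)
{
	nss_handle = dlopen("libnss3.so", RTLD_NOW | RTLD_GLOBAL);
	if (!nss_handle) {
		fprintf(stderr, "dlopen: %s\n", dlerror());
		return -1;
	}

	pk11_getbestslot = (void *(*)(unsigned long, void *))
		dlsym(nss_handle, "PK11_GetBestSlot");
	if (!pk11_getbestslot) {
		fprintf(stderr, "dlsym: %s\n", dlerror());
		return -1;
	}
	return 0;
}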
- it enforces a strict internal API/ABI between main libknet and the
plugins. Making changes to those becomes very complex, especially during updates (see also point 4). Right now we want to keep those API/ABI free to change and expand.
I'd think keeping both sides in the same upstream makes the internal API/ABI a non-issue at the upstream level. Packaging can use strict versioned dependencies between the plugin packages and the main library to force upgrading them together (this only exposes the need to restart the application after library/plugin upgrades).
Packaging only helps you maintain on-disk compatibility between versions, though. The main library might be loaded by an application and not restarted on package upgrade.
Other solutions are module versioning (like library versioning, but independent) or plugin directory versioning, which enables coexistence of different module ABI versions (again, the module ABI could change while keeping the library ABI undisturbed). See also below.
right, same as I suggested in the other email.
- it's very difficult to debug modules due to the symbol resolving
mechanism (Honza in CC can provide more details).
For what it's worth, gdb works fine for me...
wferi@lant:~/ha/kronosnet/kronosnet/libknet/tests$ LD_LIBRARY_PATH=../.libs gdb .libs/int_crypto_test
[...]
Reading symbols from .libs/int_crypto_test...done.
(gdb) b encrypt_nss
Function "encrypt_nss" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (encrypt_nss) pending.
(gdb) r
Starting program: /home/wferi/ha/kronosnet/kronosnet/libknet/tests/.libs/int_crypto_test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5054700 (LWP 1856)]
[New Thread 0x7ffff4853700 (LWP 1857)]
[New Thread 0x7ffff4052700 (LWP 1858)]
[New Thread 0x7ffff3851700 (LWP 1859)]
[New Thread 0x7ffff3050700 (LWP 1860)]
[New Thread 0x7ffff284f700 (LWP 1861)]
[New Thread 0x7ffff204e700 (LWP 1862)]
Test knet_handle_crypto with nss/aes128/sha1 and normal key
knet logs: [info] common: crypto_nss.so has been loaded from /home/wferi/ha/kronosnet/kronosnet/libknet/tests/../.libs/crypto_nss.so
knet logs: [debug] crypto: Initizializing crypto module [nss/aes128/sha1]
knet logs: [debug] nsscrypto: Initizializing nss crypto module [aes128/sha1]
knet logs: [debug] crypto: security network overhead: 68
Source Data: Encrypt me!

Thread 1 "int_crypto_test" hit Breakpoint 1, nsscrypto_encrypt_and_signv (iovcnt_in=1,
    buf_out_len=0x7fffffffd0b0, buf_out=0x5555579090e0 "", iov_in=0x7fffffffd030,
    knet_h=0x7ffff5055010) at crypto_nss.c:625
625             if (encrypt_nss(knet_h, iov_in, iovcnt_in, buf_out, buf_out_len) < 0) {
(gdb)
Let´s see how it goes. Apparently the plugin module used in corosync was much more complex and caused problems that theoretically we should not see here (according to Jan).
#3 I am not happy about, especially given that the project is still in its "early" days. This specifically has proven very challenging in some environments.
Nothing threatening comes to my mind at the moment, but I'd be happy to look into any concrete problems brought up.
agreed.
- runtime operations (updates) can be nasty. You could have
application X that has loaded libknet 1.0 with modules 1.0; an apt-get update brings in libknet 1.1 or whatever, with new plugins; the application tries to load a module and kaboom.
Well, yes. But at least it's in one hand: with modules you've got a small internal ABI to watch out for, with direct dlopens you've got several foreign ABIs to follow. Yes, development speed matters, but as knet matures, the balance will inevitably tip, if it hasn't already.
#4 can be solved by adding some kind of hashing/signing mechanism of the modules (aka load only modules that match the build).
That sounds somewhat overkill... Why not just use a version number if we really must? Or introduce an extensible module ABI (with an explicit size at the front of the module definition) and stay safe for a longer term? Just thinking out loud...
Yeah we got to the same conclusion. See again my other reply :-)
As for the code I have seen so far, please find another way to pass log_msg down to the plugins. I am not going to export it to the world :-)
Yeah, that's a wart. I don't know how to make that symbol accessible in the usual way in the modules without including its code. And I wonder what other internal symbols may need to be exposed later. Does the format used in the log pipe constitute ABI? This might even be a show stopper.
It's no different than installing a callback. See for example:
int knet_handle_enable_pmtud_notify(knet_handle_t knet_h,
                                    void *pmtud_notify_fn_private_data,
                                    void (*pmtud_notify_fn) (
                                            void *private_data,
                                            unsigned int data_mtu));
The module init would need to have an extra parameter to pass log_msg in, something like void (*log_msg_fn)(.....).
At init time, it will store the pointer to log_msg somewhere internal and it needs to do that only once. Should be fairly straightforward.
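As a minimal sketch of that idea (knet_module_init and module_log_fn_t are invented names, just to show the shape, not the actual libknet API):

/*
 * The core passes its logging function to the module at init time;
 * the module stores the pointer once and uses it from then on, so
 * log_msg itself never has to be exported.
 */
typedef void (*module_log_fn_t)(void *knet_h, int subsystem, int level,
				const char *fmt, ...);

static module_log_fn_t module_log;	/* stored once, at init time */
static void *module_knet_h;

int knet_module_init(void *knet_h, module_log_fn_t log_fn)
{
	module_knet_h = knet_h;
	module_log = log_fn;
	return 0;
}

static void module_debug(const char *msg)
{
	if (module_log)
		module_log(module_knet_h, 0 /* subsystem */, 7 /* debug */, "%s", msg);
}

The signature of log_msg itself would stay an internal detail of libknet; only this one callback crosses the module boundary.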
Fabio