Stop searching for shared libraries
Aug 4th, 2022
Nix, Guix, Gentoo Prefix and Spack install every package in their own immutable prefix. These prefix directories contain a unique hash derived from the versions and flavors of the package itself and its dependencies, which ensures that multiple variants of the same package can coexist.
The non-standard directory structure makes it hard for the dynamic linker to locate libraries. But what if we don't have to search at all?
Troubles locating libraries
When running a dynamically linked executable, the dynamic linker has to find the required shared libraries in these non-standard directories. Most build systems make an effort to locate the libraries to link against, but leave no hint for the dynamic linker about which library they actually linked. This can be frustrating, because a library is linked successfully but then not found at runtime:
    $ gcc -shared -o libf.so -x c - <<EOF
    #include <stdio.h>
    void f() { puts("hello world"); }
    EOF
    $ gcc -o main -x c - -L. -lf <<EOF
    void f();
    int main() { f(); }
    EOF
    $ ./main
    ./main: error while loading shared libraries: libf.so: cannot open shared object file: No such file or directory
On traditional Linux distros this is typically not an issue, since libraries are installed in a default location such as /usr/lib, and during the build you can always set LD_LIBRARY_PATH to the build dir if you need to run something:
    $ LD_LIBRARY_PATH=. ./main
    hello world
In fact autotools packages give various tips on how to ensure that libraries are located at runtime:
    Libraries have been installed in:
       /opt/spack/linux-ubuntu20.04-zen/gcc-7.5.0/texinfo-6.5-uffty3xizvrgyisiaklf3dfewgvqj3oy/lib/texinfo

    If you ever happen to want to link against installed libraries
    in a given directory, LIBDIR, you must either use libtool, and
    specify the full pathname of the library, or use the '-LLIBDIR'
    flag during linking and do at least one of the following:
       - add LIBDIR to the 'LD_LIBRARY_PATH' environment variable
         during execution
       - add LIBDIR to the 'LD_RUN_PATH' environment variable
         during linking
       - use the '-Wl,-rpath -Wl,LIBDIR' linker flag
       - have your system administrator add LIBDIR to '/etc/ld.so.conf'
Requiring LD_LIBRARY_PATH is clearly bad user experience, and relying on /etc/ld.so.conf is not an option when multiple variants of the same library should be able to coexist.
Using rpath
In Spack, the solution is to rely on rpaths, which are additional search paths registered in the executable or library itself, considered by the dynamic linker before it looks in the system paths:
    $ gcc -o main -x c - -L. -lf -Wl,-rpath,$PWD <<EOF
    void f();
    int main() { f(); }
    EOF
    $ ./main
    hello world
Registering search paths in the binary for the binary is clearly an improvement over global search paths, and this should solve all problems, right?
Unfortunately though, the problem is not entirely solved: for a package manager it is still unclear what rpaths to register. In Spack, the rpaths are determined heuristically: take the prefix path of each link-type dependency, as well as the install directory of the package itself, and join the path with lib or lib64, since that's where libraries are typically installed.
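The heuristic can be sketched in a few lines of shell. This is only an illustration of the idea, not Spack's actual implementation; the function name rpath_flags and the prefixes are made up.

```sh
# Hypothetical sketch of the heuristic: for every prefix passed in,
# emit an rpath flag for <prefix>/lib and for <prefix>/lib64.
rpath_flags() {
    flags=""
    for prefix in "$@"; do
        for libdir in lib lib64; do
            flags="$flags -Wl,-rpath,$prefix/$libdir"
        done
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Made-up prefixes, for illustration only.
rpath_flags /opt/store/zlib-abc123 /opt/store/mypkg-def456
```

Such a guess misses libraries that live anywhere else, for instance in a subdirectory like lib/texinfo as in the libtool output quoted earlier.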
There's another (minor) issue with rpaths too, namely that they increase startup time. glibc's dynamic linker uses a cache, which maps needed libraries to their install location. Libraries found through rpaths bypass this cache: the loader probes the rpath directories one by one for every needed library, which results in tons of failed file system lookups. This problem has been addressed in Guix by patching glibc's loader to use a per-package loader cache.
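To get a feel for the cost, the toy model below imitates a search over a list of directories and counts how many locations are probed before every library is found. It is not glibc's actual algorithm (the real loader tries additional subdirectories, for example), and all directory and library names are made up.

```sh
# Toy model of an rpath search (not glibc's actual algorithm): probe each
# directory in order until the library is found, counting every attempt.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b" "$tmp/c"
touch "$tmp/a/libx.so" "$tmp/b/liby.so" "$tmp/c/libz.so"

probes=0
for lib in libx.so liby.so libz.so; do
    for dir in "$tmp/a" "$tmp/b" "$tmp/c"; do
        probes=$((probes + 1))
        if [ -e "$dir/$lib" ]; then
            break
        fi
    done
done

echo "probes: $probes"
rm -rf "$tmp"
```

Three libraries spread over three directories already take six probes here, and the count grows with both the number of libraries and the number of search directories.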
Killing two birds with one stone
To solve both the discrepancy between the linker & dynamic loader and the “stat storm” issue, a much simpler solution is to change the soname to the absolute path of the library after it's installed.
The soname is an identifier that (by convention) consists of libname.so.abi-version. The linker copies the soname as a needed library into the dynamic section of the dependent binary:
    $ gcc -shared -o libf.so.4.2.1 -x c -Wl,-soname,libf.so.1 - <<EOF
    #include <stdio.h>
    void f() { puts("hello world"); }
    EOF
    $ ln -s libf.so.4.2.1 libf.so
    $ ln -s libf.so.4.2.1 libf.so.1
    $ ls
    libf.so libf.so.1 libf.so.4.2.1
    $ gcc -o main -x c - -L. -lf <<EOF
    void f();
    int main() { f(); }
    EOF
    $ readelf -d main | grep libf
    0x0000000000000001 (NEEDED) Shared library: [libf.so.1]
    $ LD_LIBRARY_PATH=. ./main
    hello world
In this typical example, the linker locates the library as libf.so, which is a symlink to libf.so.4.2.1. It copies over the soname (which is libf.so.1) into the executable, and subsequently the dynamic loader locates the library as libf.so.1 at runtime. Note that these version suffixes and symlinks are merely a convention; nothing enforces them.
Now, nothing prevents us from setting a soname that contains a / dir separator; the linker happily copies the soname verbatim as a string. The dynamic loader will not search for a needed library if its name contains a forward slash /. Instead, it interprets the name as a path and loads it directly.
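The rule the loader applies to each needed entry can be sketched as a simple string check. The function name classify_needed and the example path are hypothetical; the sketch only models the decision, not the loading itself.

```sh
# Sketch of the decision only: a needed entry that contains a slash is
# treated as a path, anything else goes through the usual library search.
classify_needed() {
    case "$1" in
        */*) echo "path" ;;
        *)   echo "search" ;;
    esac
}

classify_needed libf.so.1              # prints "search"
classify_needed /opt/example/libf.so   # prints "path"
```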
And this is our way out of rpath heuristics and “stat storms”: simply set the soname of a library to its own absolute path upon install, and use the linker and dynamic loader in their natural way:
    $ gcc -shared -o libf.so -x c -Wl,-soname,$PWD/libf.so - <<EOF
    #include <stdio.h>
    void f() { puts("hello world"); }
    EOF
    $ gcc -o main -x c - -L. -lf <<EOF
    void f();
    int main() { f(); }
    EOF
    $ ./main
    hello world
Much better.
Note that rpaths could still be useful when the executable or library actually dynamically loads libraries with dlopen. Fortunately non-standard sonames are not an issue for dlopen(filename, ...): it simply locates the library by filename.