LD_LIBRARY_PATH considered harmful
The purpose of the LD_LIBRARY_PATH environment variable is to instruct the
linker to consider additional directories when searching for libraries. Its
valid use case is testing alternative library versions installed in
non-standard locations. In contrast, globally setting LD_LIBRARY_PATH
(e.g. in a user's profile) is harmful because there is no setting that fits
every program. The directories in LD_LIBRARY_PATH are considered before the
default ones and the ones specified in the binary executable. Thus a system
command that is supposed to use a system library can easily get linked at
runtime with an API-incompatible version. Also, a program that relies on a
certain LD_LIBRARY_PATH setting creates the maintenance burden of always
accurately documenting that setting and distributing that documentation with
the binary. Instead, to avoid these issues, the additional directories (if any)
that should be searched by the runtime linker should be specified via linker
options (e.g. -rpath or -R) at build-time. This results in those
directories being written to an ELF attribute that is considered by the runtime
linker (i.e. the runpath).
Related¶
Usenet discussions about the misuse of LD_LIBRARY_PATH go back as early as
1993. In 1994, Casper H.S. Dik (who later posted as a Sun engineer)
concludes his answer in comp.unix.solaris with 'LD_LIBRARY_PATH: just say
no'. For context, the first Solaris version that comes with ELF executables and
shared libraries seems to be Solaris 2.0, which was released in 1992. Linux
has supported ELF since 1995.
Around 1999, David Barr published the article Why LD_LIBRARY_PATH is bad.
It has two examples that detail how LD_LIBRARY_PATH causes harm, motivates
valid uses and describes better alternatives already available on Solaris
7. The page LD_LIBRARY_PATH Is Not The Answer references David Barr's
article and calls globally setting LD_LIBRARY_PATH a 'complete hack'.
Also referenced by this page is Rod Evans' 2004 blog post LD_LIBRARY_PATH - just
say no. Writing in his sun.com blog as a Sun employee at that time, he describes
the sister variables LD_LIBRARY_PATH_32 and LD_LIBRARY_PATH_64 that are
also available on Solaris, in addition to LD_LIBRARY_PATH. His acroread example
shows how they can complicate the situation such that even more harm is done.
His conclusion is also to use the runpath and, where necessary, the $ORIGIN
linker variable, which the runtime linker substitutes with the path where the
executable is located. Similar to this example is the war story Purging
LD_LIBRARY_PATH, written in 2010 by Joseph D. Darcy on his Oracle blog. He
describes the 'messy' way the JDK used and manipulated LD_LIBRARY_PATH until
version 7. Again, $ORIGIN is found to be a better alternative mechanism for
that use case. Another (then) Sun colleague, Ali Bahrami, follows up on Evans
with Avoiding LD_LIBRARY_PATH: The Options, in 2007. He calls LD_LIBRARY_PATH a
'crude tool' and argues that it is probably the '#1 one way to
get yourself into trouble in an ELF environment'. As an
alternative he describes the elfedit tool available in Solaris
11 and later Solaris 10 patch levels.
The Shared Library HowTo also references David Barr's article and concludes that
LD_LIBRARY_PATH is handy for development and testing, but shouldn't be modified by an installation process for normal use by normal users.
As an alternative, it includes an example of how (on Linux) the runtime linker
/lib/ld-linux.so.2 can be explicitly invoked to execute a given binary
using an alternative search path.
The Sun Studio 12 Fortran Programming Guide (!) warns about
using LD_LIBRARY_PATH for anything but test scenarios:
Use of the LD_LIBRARY_PATH environment variable with production software is strongly discouraged. Although useful as a temporary mechanism for influencing the runtime linker’s search path, any dynamic executable that can reference this environment variable will have its search paths altered. You might see unexpected results or a degradation in performance.
(emphasis theirs)
Linux distributions usually don't include any package that relies on a certain
LD_LIBRARY_PATH setting - they install the packaged libraries into the
standard locations. But even a package distribution like OpenCSW (which
installs all its packages into a non-standard path) has a policy against
LD_LIBRARY_PATH for all the right reasons:
It is not necessary to set it for OpenCSW binaries. All of them are built with the -R flag, so each binary itself knows where to look for the shared objects.
You do not need to set LD_LIBRARY_PATH system-wide; and if you do, you will likely break your system, even to the point of locking yourself out. Some of the library names clash between /usr/lib and /opt/csw/lib, and if you run the Solaris openssh daemon with LD_LIBRARY_PATH set to /opt/csw/lib, /usr/lib/ssh/sshd will try to load libcrypto from /opt/csw/lib and fail to start.
They also reference Rod Evans' blog article.
The title of this article is inspired by the considered harmful meme. See
for example Go To Statement Considered Harmful and Recursive Make
Considered Harmful. Like LD_LIBRARY_PATH, goto has legitimate uses
as well, cf. Structured Programming with go to (Knuth,
1974).
Possible Roots¶
Looking at the documented harmfulness of LD_LIBRARY_PATH, one might wonder why
it is popular in certain circles. One reason can probably be traced back to the
standard install note printed when installing a package that uses
Libtool (often used with Autoconf/Automake):
----------------------------------------------------------------------
Libraries have been installed in:
$PREFIX/lib
If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
- add LIBDIR to the `LD_LIBRARY_PATH' environment variable
during execution
- add LIBDIR to the `LD_RUN_PATH' environment variable
during linking
- use the `-Wl,-rpath -Wl,LIBDIR' linker flag
- have your system administrator add LIBDIR to `/etc/ld.so.conf'
See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
This note contains two pieces of bad advice:
- the use of LD_LIBRARY_PATH
- the use of LD_RUN_PATH (which is in effect similar to LD_LIBRARY_PATH but only considered if the -rpath option isn't supplied)
It is unfortunate because the ramifications of the alternatives aren't qualified and LD_LIBRARY_PATH is even mentioned first.
Thus, a developer or sysadmin who doesn't know much about linking might be
tempted to see LD_LIBRARY_PATH as THE standard way and, because it works for
one package, wrongly internalize that as this-is-how-it-is-done-on-unix.
In addition to that, some vendors that distribute binary executables and libraries just give bad advice in their install instructions. For example Oracle, the well-known 'enterprise' DB vendor:
Add the name of the directory containing the Instant Client libraries to LD_LIBRARY_PATH.
(SQL*Plus® User's Guide and Reference, Oracle 11g2, Configuring SQL*Plus Instant Client)
Before you can connect Instant Client (including Instant Client Light) to an Oracle database, ensure that the LD_LIBRARY_PATH environment variable specifies the directory that contains the Instant Client libraries.
(Database Client Installation Guide, Oracle 11g2, Recommended Postinstallation Tasks)
The instantclient_12_1 directory must be on the LD_LIBRARY_PATH before linking the application.
(Oracle C++ Call Interface Programmer's Guide, Oracle 12c, Installation and Upgrading)
Last but not least, a quick Google search for some
cannot-start-program-library-not-found error might turn up low-quality forum posts
where setting LD_LIBRARY_PATH is suggested.
Harm at compile time¶
At compile time, the linker ld is usually called by the compiler such that
all object files a binary executable (or library) consists of are linked
together and dependent libraries are referenced. How LD_LIBRARY_PATH
influences the linking differs between Linux and Solaris.
Linux¶
The LD_LIBRARY_PATH directories aren't considered when ld searches for
libraries specified via -l. However, LD_LIBRARY_PATH is considered when
shared library dependencies of linked shared libraries are resolved (cf.
-rpath-link in ld(1)). In that case, the LD_LIBRARY_PATH directories
are searched after the ones specified with -rpath-link and -rpath but
before the ones specified by ELF attributes and the default ones (e.g. /lib
and /usr/lib).
Solaris¶
On Solaris, in contrast to Linux, the LD_LIBRARY_PATH directories are
searched by ld when searching for libraries specified via -l. Those
directories are appended to the search path resulting from any -L option.
Makefiles¶
Even if LD_LIBRARY_PATH is not globally set, it may still be in effect
because a poorly written makefile assigns this environment variable.
Also, when make is called from an IDE (like emacs), the environment of that
process is inherited. Thus, an LD_LIBRARY_PATH setting in the start script
of that IDE may cause harm.
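To rule out an inherited setting while building, make can be started from an environment in which the variables are unset. The following is a minimal sketch, assuming a POSIX shell; run_clean is a hypothetical helper name, and it does not help when the makefile itself assigns the variable.

```shell
# Run a command with LD_LIBRARY_PATH and its variants removed from the
# environment. The subshell keeps the calling shell's settings untouched.
# 'run_clean' is a made-up helper name, not part of any tool.
run_clean() {
    (
        unset LD_LIBRARY_PATH LD_LIBRARY_PATH_32 LD_LIBRARY_PATH_64 LD_RUN_PATH
        "$@"
    )
}

# Example: build without the settings inherited from an IDE or login shell.
# run_clean make
```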
Harm at runtime¶
At runtime, the runtime linker (e.g. on Linux this is ld.so) searches the
LD_LIBRARY_PATH directories before the ones specified by the DT_RUNPATH
ELF attribute and before the default ones.
On Linux, the DT_RPATH ELF attribute (which is documented as deprecated) is
considered before LD_LIBRARY_PATH if and only if the binary doesn't also
have the DT_RUNPATH attribute set. If DT_RUNPATH is set, the DT_RPATH ELF
attribute is ignored.
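The lookup order described above for Linux can be summarized as a small shell function. It only illustrates the rules stated in this section (the function name and its yes/no arguments are made up here) and ignores any further details of the real runtime linker.

```shell
# Print the order in which the Linux runtime linker consults its sources,
# following the rules described in the text.
# Arguments: has_rpath has_runpath (each "yes" or "no").
# 'linux_search_order' is a hypothetical helper for illustration only.
linux_search_order() {
    has_rpath=$1
    has_runpath=$2
    # DT_RPATH only counts when DT_RUNPATH is absent.
    if [ "$has_rpath" = yes ] && [ "$has_runpath" = no ]; then
        echo DT_RPATH
    fi
    echo LD_LIBRARY_PATH
    if [ "$has_runpath" = yes ]; then
        echo DT_RUNPATH
    fi
    echo default
}

linux_search_order yes no    # binary with DT_RPATH only
linux_search_order yes yes   # binary with both attributes
```

In both cases LD_LIBRARY_PATH is consulted before DT_RUNPATH, which is why a stray setting can override a runpath that was carefully set at build time.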
The writing of these two ELF attributes is system dependent:
System | Compiler switch | ELF attribute |
---|---|---|
Linux | -Wl,-rpath,SOMEDIR | DT_RPATH = SOMEDIR |
Linux | -Wl,-RSOMEDIR | DT_RPATH = SOMEDIR |
Linux | -Wl,--enable-new-dtags,-rpath,SOMEDIR | DT_RUNPATH = SOMEDIR |
Solaris | -RSOMEDIR | DT_RUNPATH = DT_RPATH = SOMEDIR |
Solaris | -Wl,-RSOMEDIR | DT_RUNPATH = DT_RPATH = SOMEDIR |
Solaris | -Wl,-rpath,SOMEDIR | DT_RUNPATH = DT_RPATH = SOMEDIR |
Note that:
- The compiler option -Wl instructs the compiler to pass the option following the first comma directly to the linker. All following commas are interpreted as argument delimiters.
- On Linux, -Wl,-rpath,SOMEDIR and -Wl,-RSOMEDIR are equivalent due to option parsing magic: for compatibility reasons, -R is overloaded. If the argument of -R is a filename, the option has a different effect.
- On Solaris, the compiler and the linker both understand -R such that -Wl,-R is equivalent to -Wl,-rpath,SOMEDIR.
- The Solaris 10 ld also understands -rpath, although this isn't documented in all versions of the SunOS 5.10 ld(1) man page. It is documented in the 2011 version of that page, though.
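The first note can be illustrated in the shell. The sketch below only mimics how a compiler driver splits a -Wl argument at the commas; wl_to_ld_args is a hypothetical helper and not part of any toolchain.

```shell
# Show which arguments the linker receives for a given -Wl option.
# 'wl_to_ld_args' is a made-up name used for illustration only.
wl_to_ld_args() {
    # Strip the leading "-Wl," and turn the remaining commas into spaces.
    printf '%s\n' "${1#-Wl,}" | tr ',' ' '
}

wl_to_ld_args -Wl,-rpath,SOMEDIR    # prints: -rpath SOMEDIR
```

This is also why the spelling -Wl,-rpath -Wl,LIBDIR from the Libtool note has the same effect: both forms hand -rpath and the directory to ld as two separate arguments.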
Conclusion¶
Globally setting LD_LIBRARY_PATH is never a good idea. The
narrow original use case of LD_LIBRARY_PATH is quick tests of alternate libraries.
When dealing with properly created executables, setting
LD_LIBRARY_PATH is redundant in the best case, but it breaks
things in the common case. There are better mechanisms and tools
than LD_LIBRARY_PATH available to instruct the linker how to
search for the correct libraries at build-time and at runtime.
Recommendations¶
Verify Environment Settings¶
Verify that LD_LIBRARY_PATH (or one of its variants LD_LIBRARY_PATH_32,
LD_LIBRARY_PATH_64 or LD_RUN_PATH) isn't in fact globally set via shell
run control files like /etc/profile, /etc/bashrc or similar. Also check
that it isn't set in user dotfiles like ~/.bashrc, ~/.profile etc.
Such a setting would be bad in the config of a development user,
but far worse in the profile of a production user.
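Such a check can be scripted with grep. This is a minimal sketch: scan_for_ld_settings is a hypothetical helper name, and the file list in the usage comment is only an assumption that should be extended for the shells actually in use.

```shell
# Print "file:line:text" for every line that mentions LD_LIBRARY_PATH,
# one of its _32/_64 variants, or LD_RUN_PATH.
# Missing or unreadable files are skipped silently; the extra /dev/null
# argument makes grep print the file name even for a single file.
scan_for_ld_settings() {
    grep -Ens 'LD_LIBRARY_PATH(_32|_64)?|LD_RUN_PATH' "$@" /dev/null
}

# Example:
# scan_for_ld_settings /etc/profile /etc/bashrc ~/.bashrc ~/.profile
```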
Whether any running process still has LD_LIBRARY_PATH set can be
verified by looking at its environment. For example, on Linux
via /proc:
$ < /proc/$SOMEPID/environ tr '\0' '\n' | grep LD_LIBRARY_PATH
Or on Solaris via pargs:
$ pargs -e $SOMEPID | grep LD_LIBRARY_PATH
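On Linux, the /proc check can be extended to all running processes. The following sketch assumes a mounted /proc and only sees processes whose environment the current user may read; both function names are made up for this example.

```shell
# Succeed if the NUL-separated environment file $1 sets LD_LIBRARY_PATH
# or one of its _32/_64 variants. Unreadable files count as "not set".
env_sets_ld_library_path() {
    { tr '\0' '\n' < "$1"; } 2>/dev/null |
        grep -Eq '^LD_LIBRARY_PATH(_32|_64)?='
}

# Print the PID of every readable process that has the variable set.
pids_with_ld_library_path() {
    for envfile in /proc/[0-9]*/environ; do
        if env_sets_ld_library_path "$envfile"; then
            pid=${envfile#/proc/}
            echo "${pid%/environ}"
        fi
    done
}

# Example (run as root to cover all users):
# pids_with_ld_library_path
```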
Set the runtime library search path at build time¶
Analogously to the -LSOMEDIR option that adds a directory to the build-time
library search path, the option -Wl,-rpath,SOMEDIR (or -Wl,-RSOMEDIR) adds a
directory to the runtime library search path. That means that the
resulting path is written by the linker into the DT_RPATH and/or DT_RUNPATH
ELF attribute of the resulting binary.
The attributes set this way can be printed on Linux via readelf, e.g.
$ readelf -d my_binary_or_so | grep PATH
0x000000000000001d (RUNPATH) Library runpath: [SOMEDIR]
And on Solaris via elfdump:
$ elfdump my_binary_or_so | grep PATH
[4] RUNPATH 0x128 SOMEDIR
[5] RPATH 0x128 SOMEDIR
Other interesting attributes dumped by those tools that are relevant in
this context are NEEDED (i.e. the dependent shared libraries specified via
-l or as absolute path) and SONAME (i.e. the name of a shared library
that is copied from the library to the NEEDED attribute of the binary that
depends on that library).
The effect of different runpaths and specified libraries can be verified by calling
ldd my_binary_or_so. When its output contains lines like
libxyz.so => not found
on Linux, or on Solaris
libxyz.so => (file not found)
then the path is still incomplete or incorrect and the runtime linker will abort the program start with a message like this
./my_binary: error while loading shared libraries: libxyz.so: cannot open shared object file: No such file or directory
on Linux and on Solaris:
ld.so.1: my_binary: fatal: libxyz.so: open failed: No such file or directory
Killed
The Linux runtime linker exits with exit status 127, while the Solaris runtime linker exits with 137.
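The ldd check lends itself to scripting, e.g. as part of a packaging or deployment test. The sketch below reads ldd output on standard input and prints the names of unresolved libraries; unresolved_libs is a hypothetical helper name, and the filter relies only on the two output forms shown above.

```shell
# Read ldd output on stdin and print the names of libraries that could
# not be resolved. Matches both the Linux form ("not found") and the
# Solaris form ("(file not found)").
unresolved_libs() {
    awk '/not found/ { print $1 }'
}

# Example:
# ldd my_binary_or_so | unresolved_libs
```

An empty result means that every dependency was resolved with the current runpath and environment.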
Once a binary has started successfully, the libraries that were actually linked at runtime can be verified via pldd, which is available on Linux and Solaris, or alternatively via lsof.
Use $ORIGIN¶
The linker variable $ORIGIN is expanded by the runtime linker
to the current 'origin' of the ELF binary.
The origin is the directory where the binary is stored.
Thus, using this variable in a path specified via
-Wl,-rpath,SOMEDIR (or via -Wl,-RSOMEDIR) allows for path
specifications that are relative to the location of the binary.
The obvious use case is binaries that are supposed to be installed inside a non-standard prefix (e.g. /opt/foo) together with some of the libraries they need. The directory could then be specified like this (in a shell):
-Wl,-rpath,'$ORIGIN/../lib64'
Note that the linker variable $ORIGIN is enclosed in single
quotes so that it is not expanded by the shell (e.g. bash).
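The effect of the quoting can be checked directly in a shell. In this small demonstration, ORIGIN is explicitly set to the empty string so that the result is predictable; in a typical shell session it is simply unset, which expands the same way.

```shell
# Stand-in for the usual situation where no shell variable ORIGIN exists.
ORIGIN=

# Single quotes: the linker receives the literal string $ORIGIN.
good='-Wl,-rpath,$ORIGIN/../lib64'

# Double quotes: the shell expands $ORIGIN before the compiler sees it.
bad="-Wl,-rpath,$ORIGIN/../lib64"

printf 'single quotes: %s\n' "$good"   # -Wl,-rpath,$ORIGIN/../lib64
printf 'double quotes: %s\n' "$bad"    # -Wl,-rpath,/../lib64
```

The second form silently produces a runpath of /../lib64, which is why a quick look at the readelf or elfdump output after the build is worthwhile.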
When using this in a makefile, in addition to the single quotes, the dollar sign has to be doubled so that make doesn't expand it as a make variable:
-Wl,-rpath,'$$ORIGIN/../lib64'
The linker variable $ORIGIN is understood by both the Linux and
the Solaris runtime linker.
On Linux, the runtime linker also expands a few other variables.
On Solaris, use -Wl,-i¶
The option -Wl,-i instructs the linker to ignore any
LD_LIBRARY_PATH environment variable. Thus, this option can
be used as a safety net in case LD_LIBRARY_PATH is accidentally
still set.
Unfortunately, ld on Linux interprets -i differently
(i.e. as: link incrementally).
Patch existing ELF binaries¶
In case one doesn't have access to the source code, it is still
an option to rewrite the DT_RPATH and/or DT_RUNPATH attributes
of an existing ELF binary. The tool patchelf supports this.
For example, to fix some Oracle executables and the client library that are part of the Oracle 11g2 'Instant Client' (for Linux):
$ patchelf --set-rpath '$ORIGIN/..' /path/to/instantclient_11_2/sdk/proc
$ patchelf --set-rpath '$ORIGIN' /path/to/instantclient_11_2/sqlplus
$ patchelf --set-rpath '$ORIGIN' /path/to/instantclient_11_2/libclntsh.so.11.1
The effectiveness of such changes can be verified with the usual tools, e.g.:
$ patchelf --print-rpath mybinary # or:
$ readelf -d mybinary | grep PATH
$ ldd mybinary
After the change the ldd utility shouldn't print any 'not found' lines anymore.
Patchelf is packaged for the major Linux distributions and should also be portable to other ELF platforms.
Solaris 11 (and apparently later Solaris 10 patch levels) comes
with the tool elfedit that can also be used to edit the
DT_RUNPATH/DT_RPATH ELF attributes. Example:
$ elfedit -e 'dyn:runpath $ORIGIN/lib' mybinary
However, in contrast to patchelf, it has some limitations
(cf. its man page or Changing ELF Runpaths), e.g.
elfedit may not find enough space for path edits. This is
especially an issue with binaries created on earlier Solaris 10
(or even older Solaris) versions. Later versions reserve some
space (512 bytes, it seems) at build-time, so the room for edits is of fixed size.
Thus, it is easy to construct a path that patchelf has no issue
adding but where elfedit fails with:
elfedit: [0: .dynstr]: String table does not have room to add string
Also, the dependency management of Solaris 10 doesn't seem very
complete, such that a system may provide elfedit but still lack
some libraries it needs:
ld.so.1: elfedit: fatal: liblddbg.so.4: version 'SUNWprivate_4.83' not found (required by file /usr/bin/elfedit)
ld.so.1: elfedit: fatal: liblddbg.so.4: open failed: No such file or directory
Obviously, when dealing with such poorly created binaries, created by an overpaid vendor, one may see this as an indicator of the general quality of the provided software and service. And perhaps one reaches the conclusion that there are better alternatives out there, built by people who know what they are doing. For our initial Oracle example the obvious alternative would be PostgreSQL. It is arguably of better quality than Oracle, implements features Oracle doesn't have and is ridiculously easy to install (in comparison to Oracle) because it is available from the distribution's package repositories.
Quarantine legacy LD_LIBRARY_PATH settings¶
As a last resort, when re-linking or patching an existing ELF binary is
not an option, one should at least restrict the scope of
LD_LIBRARY_PATH to that binary, i.e. to a start script of that
binary.
For example, if the original legacy binary is located under
/opt/sware/bin/foo, one limits the harm of LD_LIBRARY_PATH by
putting it in a start script like this:
$ mv /opt/sware/bin/foo /opt/sware/bin/foo.orig
$ cat <<'EOF' > /opt/sware/bin/foo
#!/bin/sh
# Wrapper: set LD_LIBRARY_PATH for the legacy binary only.
export LD_LIBRARY_PATH=/opt/sware/lib
exec /opt/sware/bin/foo.orig "$@"
EOF
$ chmod 755 /opt/sware/bin/foo
Thus, its effect is limited to the legacy binary. This is a significant improvement over setting it globally.
However, if the process forks any child processes, they inherit the
LD_LIBRARY_PATH setting.
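The inheritance is easy to observe in a shell. In the following sketch the outer sh stands in for the legacy binary and the inner sh for a child process it starts; /opt/sware/lib is just the example path used above.

```shell
# The variable is set only for the outer command, yet the process
# started by that command still sees it.
inherited=$(LD_LIBRARY_PATH=/opt/sware/lib sh -c 'sh -c "echo \$LD_LIBRARY_PATH"')
echo "child process sees: $inherited"
```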
This can be avoided by directly invoking the runtime linker and supplying the search path as an argument. A Linux example:
$ mv /opt/sware/bin/foo /opt/sware/bin/foo.orig
$ cat <<'EOF' > /opt/sware/bin/foo
#!/bin/sh
# Wrapper: hand the search path to the runtime linker directly,
# so that child processes do not inherit it.
exec /lib64/ld-linux-x86-64.so.2 --library-path /opt/sware/lib \
/opt/sware/bin/foo.orig "$@"
EOF
$ chmod 755 /opt/sware/bin/foo
This is the runtime linker for a 64 bit binary; for a 32 bit binary
one would use /lib/ld-linux.so.2.
A Solaris example:
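$ mv /opt/sware/bin/foo /opt/sware/bin/foo.orig
$ cat <<'EOF' > /opt/sware/bin/foo
#!/bin/sh
# Wrapper: hand the search path to the runtime linker directly,
# so that child processes do not inherit it.
exec /lib/64/ld.so.1 -e LD_LIBRARY_PATH=/opt/sware/lib \
/opt/sware/bin/foo.orig "$@"
EOF
$ chmod 755 /opt/sware/bin/foo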