2008-05-21

What is rpath?

http://www.the-martins.org/index.php?name=Sections&req=viewarticle&artid=6&allpages=1


Recently I was working on a cross-platform build and had to understand the concept of rpath and how it works on various platforms. I had trouble locating a concise source of information about rpath, so I decided to write up a little of what I learned here in the hope that it will help someone in the future. This article assumes an understanding of static vs. dynamic linking and a basic understanding of the concepts around ld.so and possibly other runtime linkers.

Dynamically linked applications need help from the system dynamic linker in order to satisfy their runtime dependencies. For example, if I link an executable against "libfoo.so", when I go to run that executable the dynamic linker is going to try to find that library for me to link to.

On Linux, ld.so follows some rules to find libraries. The most common case is that ld.so searches the system library directories, as defined by /etc/ld.so.conf. If the library in question is found there, all is good. However, if libfoo.so is not in a system library directory, we have an "escape hatch" that we can use to ask ld.so to look in other places. Specifically, the environment variable "LD_LIBRARY_PATH" is consulted and the directories listed there will be searched as well. One common use of this variable is to allow normal users to compile and install libraries into their accounts or other non-standard places.
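
The lookup described above can be sketched as a small Python model. This is only an illustration, not how ld.so is implemented: the function name and its parameters are made up for this example, and it ignores the ld.so cache, DT_RUNPATH and other details. The "rpath" parameter anticipates the RPATH mechanism introduced in the next paragraph. The ordering used here (RPATH first, then LD_LIBRARY_PATH, then the system directories) is the one documented for glibc's ld.so when only a classic RPATH entry is present.

```python
import posixpath


def resolve_library(name, rpath=(), ld_library_path=(),
                    system_dirs=("/lib", "/usr/lib"),
                    exists=posixpath.exists):
    """Return the first path at which `name` is found, or None.

    Simplified model: RPATH directories are tried first, then the
    LD_LIBRARY_PATH directories, then the system library directories.
    `exists` is injectable so the model can be tried without real files.
    """
    for directory in (*rpath, *ld_library_path, *system_dirs):
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

Passing a fake "exists" function (for example, the membership test of a set of paths) lets you experiment with which copy of a library would win without touching the filesystem.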

A less common technique (and less documented, it seems) is to use something called an RPATH. RPATH is like LD_LIBRARY_PATH in that it tells the runtime linker "look in this directory for shared libraries". The key difference between RPATH and LD_LIBRARY_PATH is that RPATH is written into the binary itself; it's not an environment variable. On Linux, the RPATH of a binary can be set using an argument to the linker: "-rpath" (or "--rpath" seems to work too).

In general calling the linker directly is a bad idea; most of the time you want to link with the compiler. (It knows more about the environment than ld and will do a lot of "dirty work" for you.) With gcc you can pass linker flags when you're using it to link: the flag takes the form "-Wl,--rpath=DIR", where DIR is the directory to record (several directories can be separated with colons).
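
As a rough sketch of what such a link line looks like, here is a small Python helper that assembles the gcc arguments. The helper name, its parameters and the example paths are all hypothetical; only the "-L", "-l" and "-Wl,-rpath,DIR" flags themselves are real gcc/GNU ld options.

```python
def link_command(output, objects, libs=(), lib_dirs=(), rpath_dirs=(),
                 compiler="gcc"):
    """Build an argument list that links `objects` into `output`.

    Each directory in `rpath_dirs` is passed through the compiler
    driver to the linker as -Wl,-rpath,DIR so that it is recorded
    in the resulting binary.
    """
    cmd = [compiler, "-o", output, *objects]
    for directory in lib_dirs:
        cmd.append("-L" + directory)          # link-time search path
    for directory in rpath_dirs:
        cmd.append("-Wl,-rpath," + directory)  # run-time search path
    for lib in libs:
        cmd.append("-l" + lib)
    return cmd
```

The returned list can be handed to subprocess.run(cmd, check=True). Note that "-L" only affects where the linker looks at build time; it is the rpath entry that tells the runtime linker where to look later.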

If you want to see the RPATH that is compiled into a binary, you can use the program "objdump". Specifically, "objdump -p BINARY | grep RPATH" (where BINARY is the file you want to inspect) will show you the RPATH if it is defined. Setting an RPATH is fairly uncommon, so there is a good chance you won't find one defined on the system binaries of your Linux system.
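
The same check can be scripted. The sketch below assumes that "objdump -p" prints the dynamic section as lines of the form "  RPATH   /some/dir:/other/dir"; the function names are made up for this example. It also accepts a "RUNPATH" tag, on the assumption that some newer toolchains record the path under that name instead.

```python
import subprocess


def parse_rpath(objdump_output):
    """Extract RPATH/RUNPATH directories from `objdump -p` output."""
    entries = []
    for line in objdump_output.splitlines():
        fields = line.split(None, 1)  # tag, then the rest of the line
        if len(fields) == 2 and fields[0] in ("RPATH", "RUNPATH"):
            entries.extend(fields[1].strip().split(":"))
    return entries


def read_rpath(binary):
    """Run objdump on `binary` and return its RPATH directories."""
    result = subprocess.run(["objdump", "-p", binary],
                            capture_output=True, text=True, check=True)
    return parse_rpath(result.stdout)
```

An empty list from read_rpath() means no RPATH was recorded, which is what you should expect for most system binaries.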

Next I'll talk about when RPATH is a good idea.

What are the advantages of RPATH over other solutions? For many applications on Linux systems, RPATH is not interesting. You generally have the source, or you can choose to install stuff into the system paths. If neither of these applies, LD_LIBRARY_PATH will often suffice.

But LD_LIBRARY_PATH can be problematic, too. One problem with it is that using it requires it to be defined (duh) and how to do that isn't always as simple as you might like. A common approach in applications that ship as binaries is to have a wrapper shell script that defines LD_LIBRARY_PATH before running the real executable. In my experience, this approach tends to be fragile and problematic.
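
To make the wrapper approach concrete, here is roughly what such a launcher does, written in Python rather than shell. This is a sketch only: the function names and any paths you pass in are hypothetical, and a real wrapper usually also has to work out where the application was installed.

```python
import os


def wrapper_env(lib_dir, environ):
    """Return a copy of `environ` with `lib_dir` prepended to
    LD_LIBRARY_PATH, keeping any value that was already set."""
    env = dict(environ)
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + existing if existing else "")
    return env


def launch(real_executable, lib_dir, args):
    """Replace the current process with the real executable,
    running it under the adjusted environment."""
    os.execve(real_executable, [real_executable, *args],
              wrapper_env(lib_dir, os.environ))
```

Every step here (finding the install directory, preserving an existing LD_LIBRARY_PATH, passing arguments through untouched) is a place where a wrapper can go wrong, which is part of why the approach tends to be fragile.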

Developers wanting to ship binary applications on Linux face a number of challenges. Due to efforts like the LSB Project, great strides have been made in trying to make distributions look similar in directory layout and to deal with binary compatibility issues. Binary incompatibility can be caused by many things:

- Compiler version differences, particularly for C++ due to ABI incompatibility
- Library version differences, including libc and other core libraries
- Differences in the options chosen when compiling libraries. SSL support? Freetype support?

One solution is to ship some kind of chroot environment with the binary in question. RPATH can simplify achieving this solution. Writing an RPATH into a binary or library is like being able to encode an LD_LIBRARY_PATH entry into the binary itself. Properly used, it can obviate the need for a shell script wrapper, at least for library path information.
