Thursday, July 19, 2012

Creating portable Linux binaries

For some, the idea of creating a portable Linux binary is somewhat elusive.

In this article, we will be discussing how to create a Linux binary for a specific architecture, that you will have great success running on a large variety of Linux distros. This includes current releases, somewhat old ones, and hopefully far into the future.

A common problem facing those looking to deploy proprietary software on Linux, or for those trying to supply binaries to a very large user-base which will not compile your software themselves, is how to offer one binary that fits most normal scenarios.

There are generally four naive approaches to solving this problem.

  1. The developers set up a bunch of specific distros, and compile the software on each of them, and give out distro specific binaries. This makes sense at first, till you run into some trouble.
    • You have to juggle many live CDs or maintain a bunch of installed distros which is painful and time consuming.
    • You end up only offering support for the distros you have handy, and you will get quite a few users on a more exotic distro nagging you for support, or a different and incompatible version of a distro you're already supporting.
    • The compilers or other build utilities on some distros are too old for your modern software, and you need to build them elsewhere, or figure out how to back port modern software to that old distro.
    • Builds break when a user upgrades their system.
    • Users end up needing to install some non standard system libraries, increasing everyone's frustration.
  2. The developers just statically link the binaries. This isn't always a legal option due to some licenses that may be involved. Binaries which are fully statically linked also in many instances exhibit incorrect behavior (more on this later).
  3. The developers just compile the software on one system, and pray that it works for everyone else.
  4. Compile with a really really old distro and hope it works everywhere else. However this succumbs to the last three problems outlined in naive approach #1, and in many cases the binaries produced won't work with modern distros.
Now there are plenty of companies that supply those portable Linux binaries. You find on their website downloads for say Linux i386, AMD64, and PPC. Somehow that i386 binary manages to run on every i386 system you've tested, Red Hat, Debian, SUSE, Ubuntu, Gentoo, and both old and modern versions at that. What is their secret sauce?

Now let us dive into all the important information and techniques to accomplish this worthy goal.

First thing you want to know is what exactly is your binary linked to anyway? For this, the handy ldd command comes in.

/tmp> ldd myapp
        linux-vdso.so.1 =>  (0x00007fff7a1ff000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1f8a765000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1f8a4e3000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1f8a2cc000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1f89f45000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f1f8aaa9000)
/tmp>
The lines which are directed to a file are system libraries that you need to worry about. In this case, there's libstdc++, libm, libgcc, and libc. libc and libm are both part of (E)GLIBC, the C library that most Linux applications will be using. libstdc++ is GCC's C++ library. libgcc is GCC's implementation of some programming constructs that your program may be using, such as exception handling, and things like that.

In general (E)GLIBC is broken up into many sub libraries that your program may be linked against. Other notable examples are libdl for Dynamic Loading, libpthread for threading, librt for various real time functions, and a few others.

Your application will not run on a system unless all these dependencies are found, and are compatible. Therefore, versions of these also come into play. In general a newer minor version of a library will work, but not an older.

In order to find versions numbers, you want to use objdump. Here's an example with finding out what version of (E)GLIBC is needed:
/tmp> objdump -T myapp | grep GLIBC_
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 ungetc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3   __ctype_toupper_loc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fputc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 free

/tmp>
In this case, 2.3 is the highest version number. Therefore this binary needs (E)GLIBC 2.3 or higher on the system. Note, the version numbers have nothing to do with the version installed on your system, rather (E)GLIBC marks each function with the minimum version that contains it.

Of course all this applies to other libraries as well, particularly libgcc and libstdc++.

Now that we know a little bit about what we're doing, I'm going to present the first bit of secret sauce.

If you're using C++, link with -static-libstdc++ this will ensure that libstdc++ is linked statically, but won't link every other lib statically like -static would. You want libstdc++ linked statically, because it's safe to do so, some systems may be using an older version (or have none at all, in the case of some servers), or you want your binary to remain compatible if a new major version of libstdc++ comes out which is no longer backwards compatible. Note that even though libstdc++ is GPL'd, it also offers a linking exception that allows you to link against it and even statically link it in any application.

If you see that your binary needs libgcc, also use -static-libgcc for the same reasons given above. Also, GCC is GPL'd, and has the same linking exception as above. I once had the unfortunate scenario where I sold a client an application without libgcc statically linked, that used exceptions. On his old server, as long as everything went absolutely perfectly, the application ran fine, but if any issue occurred, instead of gracefully handling the issue, the application terminated immediately. Since his libgcc was too old, the application saw the throws, but none of the catches. Statically linking libgcc fixed this issue.

Now you might be thinking, hey what about statically linking (E)GLIBC? Let me warn you that doing so is a bad idea. Some features in (E)GLIBC will only work if the statically linked (E)GLIBC is the exact same version of (E)GLIBC installed on the system, making statically linking pointless, if not downright problematic. (E)GLIBC's libdl is quite notable in this regard, as well as several networking functions. (E)GLIBC is also licensed under LGPL. Which essentially means that if you give out the source to your application, then in most cases you can distribute statically linked binaries with it, but otherwise, not. Also, since 99% of the functions are marked as requiring extremely old versions of (E)GLIBC, statically linking is hardly necessary in most cases.

The next bit of the secret sauce is statically linking those non standard libs your application needs but nothing else.

You probably never learned in school how to selectively static link those libraries you want, but it is indeed possible. Before the list of libraries you wish to static link, place -Wl,-Bstatic and afterwards -Wl,-Bdynamic.

Say in my application I want to statically link libcurl and OpenSSL, but want to dynamically link zlib, and the rest of my libs, such as other parts of (E)GLIBC, I would use the following as my link flags:
gcc -o app *.o -static-libgcc -Wl,-Bstatic -lcurl -lssl -lcrypto -Wl,-Bdynamic -lz -ldl -lpthread -lrt

The next step is to ensure that your libraries pull in as few dependencies as possible. Here's the output from ldd on my libcurl.so:
        linux-vdso.so.1 =>  (0x00007fffbadff000)
        libidn.so.11 => /usr/lib/x86_64-linux-gnu/libidn.so.11 (0x00007f84410a4000)
        libssh2.so.1 => /usr/lib/x86_64-linux-gnu/libssh2.so.1 (0x00007f8440e7b000)
        liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007f8440c6b000)
        libldap_r-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007f8440a1a000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f8440812000)
        libgssapi_krb5.so.2 => /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f84405d2000)
        libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007f8440374000)
        libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007f843ff90000)
        librtmp.so.0 => /usr/lib/x86_64-linux-gnu/librtmp.so.0 (0x00007f843fd75000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f843fb5e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f843f7d7000)
        libgcrypt.so.11 => /lib/x86_64-linux-gnu/libgcrypt.so.11 (0x00007f843f558000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f843f342000)
        libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f843f127000)
        libgnutls.so.26 => /usr/lib/x86_64-linux-gnu/libgnutls.so.26 (0x00007f843ee66000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f843ec4a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f844157e000)
        libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f843e976000)
        libk5crypto.so.3 => /usr/lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f843e74c000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f843e548000)
        libkrb5support.so.0 => /usr/lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f843e33f000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f843e13a000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f843df36000)
        libgpg-error.so.0 => /lib/x86_64-linux-gnu/libgpg-error.so.0 (0x00007f843dd32000)
        libtasn1.so.3 => /usr/lib/x86_64-linux-gnu/libtasn1.so.3 (0x00007f843db21000)
        libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f843d90f000)
This is quite unacceptable. Distros generally compile packages with everything enabled. Your application generally does not need everything a library has to offer. In the case of libcurl, you can compile it yourself, and disable the features you aren't using. In an example application, I only need HTTP and FTP support, so I could compile libcurl with very little, and now have this:
        linux-vdso.so.1 =>  (0x00007fffbadff000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f8440812000)
        libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007f8440374000)
        libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007f843ff90000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f843fb5e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f843f7d7000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f843f342000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f843ec4a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f844157e000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f843e13a000)
This is much more manageable. Refer to the documentation of your libraries in order to see how to compile them without  features you don't need.

If you're going to be compiling your own libraries, you probably want to set up a second system, virtual machine, or a chroot for building your customized library versions and applications, to ensure it doesn't conflict with your main system. Especially for the upcoming tip.

Secret sauce part 3, push your (E)GLIBC requirements down.

Let's look at an objdump on OpenSSL.
/tmp> objdump -T /usr/lib/libcrypto.so.0.9.8 | grep GLIBC_
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 chmod
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fileno
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __sysv_signal
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 printf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memset
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 ftell
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getgid
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 shutdown
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 close
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 syslog
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 ioctl
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 abort
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memchr
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 gethostbyname
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fseek
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.7   __isoc99_sscanf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 openlog
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 exit
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strcasecmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 gettimeofday
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 setvbuf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 read
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strncmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 malloc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fopen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 setsockopt
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 sysconf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getpid
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fgets
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 geteuid
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 vfprintf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 closelog
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fputc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 times
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 free
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strlen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 ferror
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 opendir
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __xstat
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 listen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3   __ctype_b_loc
0000000000000000  w   DF *UND*  0000000000000000  GLIBC_2.2.5 __cxa_finalize
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 readdir
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 dlerror
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 sprintf
0000000000000000      DO *UND*  0000000000000000  GLIBC_2.2.5 stdin
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strrchr
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 dlclose
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 poll
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getegid
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 gmtime_r
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strerror
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 sigaction
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strcat
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getsockopt
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fputs
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 lseek
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strtol
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getsockname
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 connect
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memmove
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strchr
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 socket
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fread
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __fxstat
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getenv
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __errno_location
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 qsort
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strncasecmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strcmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getuid
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3   __ctype_tolower_loc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcmp
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 feof
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fclose
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 dlopen
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 recvfrom
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strncpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 dlsym
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 closedir
0000000000000000      DO *UND*  0000000000000000  GLIBC_2.2.5 stderr
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fopen64
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 sendto
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 bind
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fwrite
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 realloc
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 perror
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fprintf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 localtime
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 write
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 accept
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 strtoul
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 open
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 time
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 fflush
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 getservbyname
/tmp>
It seems that OpenSSL would work on (E)GLIBC 2.3, except for one pesky function which needs 2.7+. This is a problem if I want to ship an application with this modern OpenSSL on say Red Hat Enterprise Linux 5 which comes with GLIBC 2.5, or say Debian Stable from ~4 years ago, which only has 2.4.

In this case OpenSSL is using a C99 version of sscanf(), but not actually by choice.

In /usr/include/stdio.h on (E)GLIBC 2.7+, you'll notice two blocks:

#if defined __USE_ISOC99 && !defined __USE_GNU \
    && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
    && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
# ifdef __REDIRECT
/* For strict ISO C99 or POSIX compliance disallow %as, %aS and %a[
   GNU extension which conflicts with valid %a followed by letter
   s, S or [.  */
extern int __REDIRECT (fscanf, (FILE *__restrict __stream,
        __const char *__restrict __format, ...),
           __isoc99_fscanf) __wur;
extern int __REDIRECT (scanf, (__const char *__restrict __format, ...),
           __isoc99_scanf) __wur;
extern int __REDIRECT_NTH (sscanf, (__const char *__restrict __s,
            __const char *__restrict __format, ...),
         __isoc99_sscanf);
# else
extern int __isoc99_fscanf (FILE *__restrict __stream,
          __const char *__restrict __format, ...) __wur;
extern int __isoc99_scanf (__const char *__restrict __format, ...) __wur;
extern int __isoc99_sscanf (__const char *__restrict __s,
          __const char *__restrict __format, ...) __THROW;
#  define fscanf __isoc99_fscanf
#  define scanf __isoc99_scanf
#  define sscanf __isoc99_sscanf
# endif
#endif

And

# if !defined __USE_GNU \
     && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
     && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
#  ifdef __REDIRECT
/* For strict ISO C99 or POSIX compliance disallow %as, %aS and %a[
   GNU extension which conflicts with valid %a followed by letter
   s, S or [.  */
extern int __REDIRECT (vfscanf,
           (FILE *__restrict __s,
      __const char *__restrict __format, _G_va_list __arg),
           __isoc99_vfscanf)
     __attribute__ ((__format__ (__scanf__, 2, 0))) __wur;
extern int __REDIRECT (vscanf, (__const char *__restrict __format,
        _G_va_list __arg), __isoc99_vscanf)
     __attribute__ ((__format__ (__scanf__, 1, 0))) __wur;
extern int __REDIRECT_NTH (vsscanf,
         (__const char *__restrict __s,
          __const char *__restrict __format,
          _G_va_list __arg), __isoc99_vsscanf)
     __attribute__ ((__format__ (__scanf__, 2, 0)));
#  else
extern int __isoc99_vfscanf (FILE *__restrict __s,
           __const char *__restrict __format,
           _G_va_list __arg) __wur;
extern int __isoc99_vscanf (__const char *__restrict __format,
          _G_va_list __arg) __wur;
extern int __isoc99_vsscanf (__const char *__restrict __s,
           __const char *__restrict __format,
           _G_va_list __arg) __THROW;
#   define vfscanf __isoc99_vfscanf
#   define vscanf __isoc99_vscanf
#   define vsscanf __isoc99_vsscanf
#  endif
# endif

These two blocks of code make fscanf(), scanf(), sscanf(), vfscanf(), vscanf(), and vsscanf() use special C99 versions. Since older applications were already compiled against C89 versions, (E)GLIBC doesn't want to potentially break them and change how an existing function works. So instead, a new set of functions were created which only exist in (E)GLIBC 2.7+, and (E)GLIBC by default will direct all calls to these functions to the proper C99 versions when compiling.

Now there are some defines you can set in your library code and application code to ensure it uses the old more backwards compatible versions, but getting the exact right combination of defines without breaking anything else can be tricky. It may also be tedious to modify a code-base you're not familiar with.

Therefore, I recommend just deleting these two blocks from your <stdio.h> on your build system. You want your build system to be able to build everything for backwards compatibility, right?

If you're recompiling libraries like OpenSSL which are designed for massive portability with all kinds of systems, odds are, they're not looking for C99 support in basic scanf() family functions anyway. If you do happen to need C99 scanf() support in your application, I recommend that you add it manually with a specialized lib, for maximum portability. You can easily find a bunch online.

The last scenario that you may encounter is that you happen to want to use a modern library function. For most libs you can just statically link them, but that won't work for (E)GLIBC. Since some functions depend on system support, or that custom versions don't perform as well as the built in system ones, you definitely want to use the built in ones if they're available. The question is, how to once the binary has already been compiled?

So for our final bit of secret sauce, dynamically load any modern functions that you want to use, and work around them, or disable some functionality if not present.

Remember libdl that we mentioned above? It offers dlopen() for opening system libraries, and dlsym() for finding out if certain functions are present or not, and retrieving a pointer to them.

I'm going to post a full example that you can look at and play with. In this example, we have a program which tries to figure out how big system pipes are. In this application, we are going to see how much data we can stuff in a pipe before we're told that the pipe is full, and the write would need to block.

Linux offers a function called pipe2() which has the crucial ability to create a pipe in non-blocking mode. If it doesn't exist, we can create it ourselves, but we prefer the built in one if possible.

#ifndef __linux__
#error This program is specifically designed for Linux, even though it works elsewhere
#endif


#include <stdio.h> //puts(), fputs(), printf(), fprintf(), stderr
#include <errno.h> //errno, perror(), EINTR, EAGAIN, EWOULDBLOCK
#include <fcntl.h> //fcntl(), F_SETFL, F_GETFL, O_NONBLOCK, F_SETFD, F_GETFD, FD_CLOEXEC, and for some: O_CLOEXEC
#include <dlfcn.h> //dlopen(), dlsym(), dlclose(), dlerror(), RTLD_LAZY
#include <unistd.h> //pipe() used in our implementation, write(), close()



//Lifted from: /usr/include/<arch>/bits/fcntl.h
#ifndef O_CLOEXEC
#define O_CLOEXEC       02000000        /* set close_on_exec */
#endif
//End lift


typedef int (*pipe2_t)(int [2], int);

//Implement the rather straight forward pipe2(), note: this function is of type: static pipe2_t
static int our_pipe2(int pipefd[2], int flags)
{
  int ret = pipe(pipefd);
  if (!ret) //Success, pipe created
  {
    //The built in pipe2() would not suffer from race conditions that the following code would succumb to in a threaded application
    if (flags & O_NONBLOCK)
    {
      fcntl(pipefd[0], F_SETFL, fcntl(pipefd[0], F_GETFL) | O_NONBLOCK);
      fcntl(pipefd[1], F_SETFL, fcntl(pipefd[1], F_GETFL) | O_NONBLOCK);
    }

    if (flags & O_CLOEXEC)
    {
      fcntl(pipefd[0], F_SETFD, fcntl(pipefd[0], F_GETFD) | FD_CLOEXEC);
      fcntl(pipefd[1], F_SETFD, fcntl(pipefd[1], F_GETFD) | FD_CLOEXEC);
    }
  }
  return(ret);
}

static pipe2_t pipe2 = our_pipe2; //pipe2() is initialized to our function


size_t pipe_size() //Manually determine the size of the system's pipe, for automatic, look up Linux specific F_GETPIPE_SZ
{
  //Create a union for using a pipe, so usage is a bit more logical
  union
  {
    int pipefd[2];
    struct
    {
      int read;
      int write;
    } side;
  } u;

  size_t amount = 0; //A pipe size of 0 signifies unknown

  if (!pipe2(u.pipefd, O_NONBLOCK)) //Note, here pipe2() is used
  {
    for (;;) //Write to a pipe in a loop, the final amount should be the size of the pipe
    {
      ssize_t w = write(u.side.write, &amount, sizeof(size_t)); //Write a size_t to the pipe

      if (w > 0) { amount += w; } //Success, add amount written and then loop
      else if (w == 0) //Pipe was closed, and we certainly didn't close it
      {
        perror("Pipe unexpectedly closed");
        amount = 0; //Reset to unknown, because an error occured
        break;
      }
      else /* Error occured trying to write */ if (errno != EINTR) //And it wasn't an interruption, so something that needs handling
      {
        if ((errno != EAGAIN) && (errno != EWOULDBLOCK)) //Failed to write to pipe - and it's nothing we'd fix
        {
          perror("Failed to write to pipe");
          amount = 0; //Reset to unknown, because an error occured
        }
        //Else, pipe is full, we're done!

        break; //In either case, we're done writing to the pipe
      }
      //Else If (errno == EINTR), we'd just loop and try again
    }
    close(u.side.read);
    close(u.side.write);
  }
  else { perror("Failed to create pipe"); }

  return(amount);
}

int main(const int argc, const char *const *const argv)
{
  void *so = dlopen("libc.so.6", RTLD_LAZY); //Open the C library
  if (so)
  {
    void *sym = dlsym(so, "pipe2"); //Grab the handle to pipe2() if it exists
    if (sym) //Success!
    {
      pipe2 = (pipe2_t)sym; //Use the built in one instead of ours
      puts("Using system's pipe2().");
    }
    else { puts("Using our pipe2()."); }
  }
  else
  {
    puts("Using our pipe2().");
    fprintf(stderr, "Could not open C library: %s\n", dlerror());
  }

  //Here's the real work
  size_t a = pipe_size();
  if (a) { printf("Pipe size is: %zu\n", a); }
  else { fputs("Could not determine pipe size.\n", stderr); }

  if (so) { dlclose(so); }
  return(0);
}
Here's how to compile and run it:
/tmp> gcc -Wall -o pipe_test pipe_test.c -ldl
/tmp> ./pipe_test
Using system's pipe2().
Pipe size is: 65536
/tmp>
Now pipe2() Was added to GLIBC in 2.9, yet this binary here according to objdump only needs (E)GLIBC 2.2.5+. Here's the output from an older system with GLIBC 2.7, using the exact same binary created on a newer system:
/tmp> ./pipe_test
Using our pipe2().
Pipe size is: 65536
/tmp>
Lastly, let me recap all the techniques we learned.
  • Use ldd and objdump to see version requirements of binaries and libraries.
  • Statically link compiler and language libraries, such as libgcc and libstdc++.
  • Statically link selected libraries, while dynamically linking others.
  • Compile selected libraries with as little needed functionality as possible.
  • Pushing (E)GLIBC requirements down, by being wary of functions which have changed over time, and (E)GLIBC redirects calls to them in newly compiled programs by default.
  • Pushing (E)GLIBC requirements down by not directly using new functions, and instead working around their presence.
Doing all this, you'll still need to make different builds for different operating systems, and different architectures like x86 and ARM, but at least you won't be forced to for all different distros and versions thereof.

One thing of note, it's possible to have Linux with different C libraries, and in those cases, you may as well be using a different Operating System. You'll be hard pressed to make complex programs compiled against one C library run on Linux which uses a different C library, where the needed one is not present. Thankfully though, all the mainstream desktop and server distros all use (E)GLIBC.

In any case, the techniques you've learned here can also be applied to other setups too.  (E)GLIBC was only focused on in this article because of its popularity and its many gotchas, but many other libraries that you may use, particularly video and audio libraries have similar issues as well.

15 comments:

  1. When using ldd, you may also want to consider using it with -v. That will show you some version information as well, and also give you an idea as to which libs pull in other libs.

    ReplyDelete
  2. Nice article, but instead of deleting stuff from stdio.h (even it is in your buildsystem only) I recommend using a mapfile (or version-script) which specifes GLIBC_2.3 as highest usable version when linking the program.

    ReplyDelete
  3. Hello Justin Bailey.

    Can you explain how to use a mapfile or version script which achieves the objective explained here?

    ReplyDelete
  4. Sorry to make an o/t comment, but I was wondering if you might be able to remark on the current state of sound in Linux.

    I'm having issues where the stock installation of Ubuntu 12.04 (via Linux Mint because I HATE Unity) where ALSA/PulseAudio/whatever the hell they're using now won't see my sound card, but installing OSS4 will give me some sound.

    I'm planning on following your idea to make changes to /etc/asound.conf in order to have ALSA apps use OSS natively, but I wanted to know if this is still accurate. Can you confirm?

    ReplyDelete
  5. What I wrote in my sound articles is still true.

    I'm using the same ALSA to OSS wrap config today.

    ReplyDelete
  6. Interesting article, but I have one question about the guideline "Statically link selected libraries, while dynamically linking others.", namely: how do you decide which libraries to statically link, and which ones to dynamically link? Is there some kind of algorithm for this? What steps to follow in order to decide what to statically link, and what to dynamically link?

    ReplyDelete
  7. Statically those that aren't very likely to be installed, or those that have important reasons to be statically linked. Dynamically link everything else.

    ReplyDelete
  8. In theory, distributing source code seems to be an easy way to get around the problem, assuming your end user 1) has administrative access, 2) can install a compiler 2) knows how to run a configure script, 3) has the technical knowledge to interpret the output of a configure script and download the appropriate dependencies, 4) has technical expertise to resolve conflicts in library versions.

    ReplyDelete
  9. Truly eye-opening, but instead of deleting code from libraries, what do you think of this?

    https://github.com/wheybags/glibc_version_header

    ReplyDelete
  10. That glibc_version_header for glibc related versioning problems is certainly helpful.

    If something like those versioned headers exist for a problem you encounter, then by all means use them. However, mastering all the techniques here will get you out of most problems, including those that those versioned headers won't.

    ReplyDelete
  11. > Truly eye-opening, but instead of deleting code from libraries, what do you think of this?
    > https://github.com/wheybags/glibc_version_header

    I tried that for the example used here, and encountered a link error: "GLIBC_WRAP_ERROR_SYMBOL_NOT_PRESENT_IN_REQUESTED_VERSION".

    Those glibc version headers only works in cases where the versions of the functions are redirected to something which exists with the same behavior. For the example here, you need to delete the redirect that's being used in the header, or come up with a different method to link a replacement function.

    In general, it may also be tricky to use techniques like that with third party libraries that have their own build systems. However, modifying your headers that everything is using is one way to ensure the correct behavior will be used.

    ReplyDelete
  12. Attract Group's retail ecommerce solutions are top-notch, offering a seamless and user-friendly experience for both businesses and customers. The website design is sleek and modern, reflecting the high-quality services they provide. Their expertise in developing custom ecommerce platforms tailored to specific business needs is impressive. With a focus on innovation and efficiency, Attract Group is undoubtedly a trusted partner for businesses looking to thrive in the competitive online retail space.

    ReplyDelete
  13. Great news for all MBBS students! The National Eligibility cum Entrance Test (NEET) exam scores have just been released, opening up a plethora of opportunities for aspiring medical professionals. With this latest development, students can now heave a sigh of relief as their hard work and dedication have finally paid off.
    NEET, known for being one of the most prestigious medical entrance exams in India, plays a crucial role in determining students' admission into medical colleges across the country. The scores obtained in this exam serve as a gateway to pursue a Bachelor of Medicine, Bachelor of Surgery (MBBS) degree.
    Importance of NEET exam score for MBBS students
    The NEET exam score holds immense significance for MBBS students. It serves as a yardstick to measure their knowledge, aptitude, and readiness for the medical field. The score obtained in this exam not only determines their eligibility for admission but also plays a crucial role in securing a seat in their desired medical college. Read More
    A high NEET score reflects a student's dedication, hard work, and commitment towards their goal of becoming a medical professional. It showcases their ability to grasp complex concepts, apply theoretical knowledge to practical situations, and excel in the field of medicine. Read More
    NEET exam score and college admissions
    The NEET exam score is a determining factor for admissions into medical colleges across India. The score obtained by a student decides their rank in the national merit list, which is used by colleges to allocate seats. The higher the NEET score, the better the chances of securing admission in a top-tier medical college.

    ReplyDelete
  14. This comment has been removed by the author.

    ReplyDelete