nm command

The nm(1) command can report the list of symbols in a given library. It works on both static and shared libraries. For a given library nm(1) can list the symbol names defined, each symbol's value, and the symbol's type. It can also identify where the symbol was defined in the source code (by filename and line number), if that information is available in the library (see the -l option).

The symbol type requires a little more explanation. The type is displayed as a letter; lowercase means that the symbol is local, while uppercase means that the symbol is global (external). Typical symbol types include T (a normal definition in the code section), D (initialized data section), B (uninitialized data section), U (undefined; the symbol is used by the library but not defined by the library), and W (weak; if another library also defines this symbol, that definition overrides this one).

If you know the name of a function, but you truly can't remember what library it was defined in, you can use nm's ``-o'' option (which prefixes the filename in each line) along with grep to find the library name. From a Bourne shell, you can search all the libraries in /lib, /usr/lib, direct subdirectories of /usr/lib, and /usr/local/lib for ``cos'' as follows:

nm -o /lib/* /usr/lib/* /usr/lib/*/* \
      /usr/local/lib/* 2> /dev/null | grep 'cos$' 

Much more information about nm can be found in the nm ``info'' documentation locally installed at info:binutils#nm.

Library constructor and destructor functions

Libraries should export initialization and cleanup routines using the gcc __attribute__((constructor)) and __attribute__((destructor)) function attributes. See the gcc info pages for information on these. Constructor routines are executed before dlopen returns (or before main() is started if the library is loaded at load time). Destructor routines are executed before dlclose returns (or after exit() or completion of main() if the library is loaded at load time). The C prototypes for these functions are:

  void __attribute__ ((constructor)) my_init(void);
  void __attribute__ ((destructor)) my_fini(void);

Shared libraries must not be compiled with the gcc arguments ``-nostartfiles'' or ``-nostdlib''. If those arguments are used, the constructor/destructor routines will not be executed (unless special measures are taken).

Special functions _init and _fini (OBSOLETE/DANGEROUS)

Historically there have been two special functions, _init and _fini that can be used to control constructors and destructors. However, they are obsolete, and their use can lead to unpredicatable results. Your libraries should not use these; use the function attributes constructor and destructor above instead.

If you must work with old systems or code that used _init or _fini, here's how they worked. Two special functions were defined for initializing and finalizing a module: _init and _fini. If a function ``_init'' is exported in a library, then it is called when the library is first opened (via dlopen() or simply as a shared library). In a C program, this just means that you defined some function named _init. There is a corresponding function called _fini, which is called whenever a client finishes using the library (via a call dlclose() that brings its reference count to zero, or on normal exit of the program). The C prototypes for these functions are:

  void _init(void);
  void _fini(void);

In this case, when compiling the file into a ``.o'' file in gcc, be sure to add the gcc option ``-nostartfiles''. This keeps the C compiler from linking the system startup libraries against the .so file. Otherwise, you'll get a ``multiple-definition'' error. Note that this is completely different than compiling modules using the recommended function attributes. My thanks to Jim Mischel and Tim Gentry for their suggestion to add this discussion of _init and _fini, as well as help in creating it.

Shared Libraries Can Be Scripts

It's worth noting that the GNU loader permits shared libraries to be text files using a specialized scripting language instead of the usual library format. This is useful for indirectly combining other libraries. For example, here's the listing of /usr/lib/libc.so on one of my systems:

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a )

For more information about this, see the texinfo documentation on ld linker scripts (ld command language). General information is at info:ld#Options and info:ld#Commands, with likely commands discussed in info:ld#Option Commands.

Symbol Versioning and Version Scripts

Typically references to external functions are bound on an as-needed basis, and are not all bound when the application starts up. If a shared library is out of date, a required interface may be missing; when the application tries to use that interface, it may suddenly and unexpectedly fail.

A solution to this problem are symbol versioning coupled with version scripts. With symbol versioning, the user can get a warning when they start their program if the libraries being used with the application are too old. You can learn more about this from ld manual's descussion of version scripts at http://www.gnu.org/manual/ld-2.9.1/html_node/ld_25.html.

GNU libtool

If you're building an application that should port to many systems, you might consider using GNU libtool to build and install libraries. GNU libtool is a generic library support script. Libtool hides the complexity of using shared libraries behind a consistent, portable interface. Libtool provides portable interfaces to create object files, link libraries (static and shared), link executables, debug executables, install libraries, install executables. It also includes libltdl, a portability wrapper for dynamically loading programs. For more information, see its documentation at http://www.gnu.org/software/libtool/manual.html

Removing symbols for space

All the symbols included in generated files are useful for debugging, but take up space. If you need space, you can eliminate some of it.

The best approach is to first generate the object files normally, and do all your debugging and testing first (debugging and testing is much easier with them). Afterwards, once you've tested the program thoroughly, use strip(1) to remove the symbols. The strip(1) command gives you a good deal of control over what symbols to eliminate; see its documentation for details.

Another approach is to use the GNU ld options ``-S'' and ``-s''; ``-S'' omits debugger symbol information (but not all symbols) from the output file, while ``-s'' omits all symbol information from the output file. You can invoke these options through gcc as ``-Wl,-S'' and ``-Wl,-s''. If you always strip the symbols and these options are sufficient, feel free, but this is a less flexible approach.

Extremely small executables

You might find the paper Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux useful. It describes how to make a truly tiny program executable. Frankly, you shouldn't use most of these tricks under normal circumstances, but they're quite instructive in showing how ELF really works.

C++ vs. C

It's worth noting that if you're writing a C++ program, and you're calling a C library function, in your C++ code you'll need to define the C function as extern "C". Otherwise, the linker won't be able to locate the C function. Internally, C++ compilers ``mangle'' the names of C++ functions (e.g., for typing purposes), and they need to be told that a given function should be called as a C function (and thus, not have its name mangled).

If you're writing a program library that could be called from C or C++, it's recommended that you include 'extern "C"' commands right in your header files so that you do this automatically for your users. When combined with the usual #ifndef at the top of a file to skip re-executing header files, this means that a typical header file usable by either C or C++ for some header file foobar.h would look like this:

/* Explain here what foobar does */

#ifndef FOOBAR_H
#define FOOBAR_H

#ifdef __cplusplus
extern "C" {

 ... header code for foobar goes here ...

#ifdef  __cplusplus

Speeding up C++ initialization

The KDE developers have noticed that large GUI C++ applications can take a long time to start up, in part due to its needing to do many relocations. There are several solutions to this. See Making C++ ready for the desktop (by Waldo Bastian) for more information.

Link-Time Optimization

GCC 4.5 has added a new optimization option, -flto. When source files are compiled and linked using -flto, GCC applies optimizations as if all the source code were in a single file, which lets it perform far more aggressive optimizations. You have to also use an optimization level (-Owhatever). It obsoletes the old -combine, but you might want to seriously consider combining it with -fwhole-program. If you want to do this, you need to use -flto at all steps of the process, and you should use exactly the same optimization and machine-dependent options throughout. The article "What's new in GCC 4.5?" in Linux Weekly News (LWN) has more information.

Linux Standard Base (LSB)

The goal of the Linux Standard Base (LSB) project is to develop and promote a set of standards that will increase compatibility among Linux distributions and enable software applications to run on any compliant Linux system. The project's home page is at http://www.linuxbase.org.

A nice article that summarizes how to develop LSB-compliant applications was published in October 2002, Developing LSB-certified applications: Five steps to binary-compatible Linux applications by George Kraft IV (Senior software engineer, IBM's Linux Technology Center). Of course, you need to write code that only accesses the standardized portability layer if you want your code to be portable. In addition, the LSB provides some tools so that application writers of C/C++ programs can check for LSB compliance; these tools use some capabilities of the linker and special libraries to do these checks. Obviously, you'll need to install the tools to do these checks; you can get them from the LSB website. Then, simply use the "lsbcc" compiler as your C/C++ compiler (lsbcc internally creates a linking environment that will complain if certain LSB rules aren't followed):

 $ CC=lsbcc make myapplication
 $ CC=lsbcc ./configure; make myapplication 
You can then use the lsbappchk program to ensure that the program only uses functions standardized by the LSB:
 $ lsbappchk myapplication
You also need to follow the LSB packaging guidelines (e.g., use RPM v3, use LSB-conforming package names, and for add-on software must install in /opt by default). See the article and LSB website for more information.

Merging Libraries into Larger Shared Libraries

What if you want to first create smaller libraries, then later merge them into larger libraries? In this case, you may find ld's "--whole-archive" option useful, which can be used to forcibly bring .a files and link them into an .so file.

Here's an example of how to use --whole-archive:

 gcc -shared -Wl,-soname,libmylib.$(VER) -o libmylib.so $(OBJECTS) \
 -Wl,--whole-archive $(LIBS_TO_LINK) -Wl,--no-whole-archive \

As the ld documentation notes, be sure to use --no-whole-archive option at the end, or gcc will try to merge in the standard libraries as well. My thanks to Kendall Bennett for both suggesting this recipe, as well as providing it.

Other Information

Red Hat's Ulrich Drepper How to Write Shared Libraries (August 28, 2004) has more information that you may find useful.