Symbols on Linux Part One: g++ Library Symbols

After many years programming solely on Windows I have recently started working on Linux – Ubuntu to be precise. One of the things that I have had to learn is the very different ways in which Linux developers deal with symbols. This series of tutorials will share what I have learned, aimed at developers who are new to Linux. This first post will cover how to get symbols for the C and C++ libraries for gcc, with a focus on how one can find this information.

This post was updated January 10, 2013, to explain how to get the C++ symbols without adding a new repository, and again on January 15, 2013 to better explain ddebs.ubuntu.com and to add a summary reference section. It was updated again March 2013 to link to later posts and to explain how to install i386 packages on 64-bit Linux.

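Future posts on this topic include:

  • Symbols on Linux Part Two: Symbols for Other Versions
  • Symbols on Linux Part Three: Linux versus Windows
  • Symbols on Linux update: Fedora Fixes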

Symbols?

The term ‘symbols’ is not entirely correct for what I’m talking about, especially on Linux. What I really mean is debugging information, which includes internal symbols, mappings between instructions and source files, type information, variable names, etc. However “debugging information” sometimes feels too verbose. Just translate in your head if it bothers you.

The importance of symbols

When debugging or profiling it is crucial to have symbols. I’ve blogged before about the importance of having third-party symbols when profiling, and symbols are also important when debugging, whether live debugging or looking at crash/core dumps. When you are building your own code you should always build with symbols. You needn’t ship symbols to your customers, and having symbols doesn’t affect the performance of your code, so there is really no reason not to build with symbols. You should also try to have symbols for as much as possible of the third party code – static and dynamic libraries – that you use.

Missing symbol symptoms

Let’s imagine that we’ve written some buggy code that does the moral equivalent of this transparently incorrect code:

printf("Hello world - %s!\n", (const char*)17);

We’re gonna crash, in the C library, and our gdb session (or cgdb, or ddd, or whatever your favorite Linux debugger shell is) is going to look something like this:

Program received signal SIGSEGV, Segmentation fault.
0xb7d7fe29 in vfprintf () from /lib/i386-linux-gnu/libc.so.6
(gdb) bt
#0 0xb7d7fe29 in vfprintf () from /lib/i386-linux-gnu/libc.so.6
#1 0xb7d87eff in printf () from /lib/i386-linux-gnu/libc.so.6
#2 0x080485c1 in main (argc=1, argv=0xbffff3e4) at test.cpp:7

gdb is saying that we crashed in vfprintf, called by printf. It turns out that that’s not actually correct – vfprintf is merely the closest public symbol to where we crashed. gdb also doesn’t know what parameters or source files are associated with the top two functions in the stack, which complicates debugging when tracking real bugs.

If you type “info shared” into gdb then you will see results something like this:

(gdb) info shared
From        To          Syms Read   Shared Object Library
0xb7fde820  0xb7ff6b9f  Yes (*)     /lib/ld-linux.so.2
0xb7e36f10  0xb7f6b5cc  Yes (*)     /lib/i386-linux-gnu/libc.so.6
(*): Shared library is missing debugging information.

The (*) indicates that no debugging information is available for two shared objects, which probably explains our lack of information about the location where we crashed.
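
If you have a lot of shared objects loaded, picking out the (*) rows by eye gets tedious. Here is a minimal Python sketch that scans pasted "info shared" output and lists the libraries gdb flagged. The function name is made up for illustration, and it assumes every data row starts with a hex address and ends with a path that contains no spaces, which is true of the output above but not guaranteed in general.

```python
def libraries_missing_debug_info(info_shared_output):
    """Return paths of shared objects that gdb marked with (*)."""
    missing = []
    for line in info_shared_output.splitlines():
        fields = line.split()
        # Assumption: data rows start with a hex address and end with the
        # library path. The header and the "(*):" legend line are skipped.
        if not fields or not fields[0].startswith("0x"):
            continue
        if "(*)" in fields:
            missing.append(fields[-1])
    return missing


sample = """\
From        To          Syms Read   Shared Object Library
0xb7fde820  0xb7ff6b9f  Yes (*)     /lib/ld-linux.so.2
0xb7e36f10  0xb7f6b5cc  Yes (*)     /lib/i386-linux-gnu/libc.so.6
(*): Shared library is missing debugging information.
"""
for path in libraries_missing_debug_info(sample):
    print(path)
```

Each path it prints is a candidate for the package lookup described in the next section.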

Getting gcc library symbols

In order to correct this we need to figure out what package these files come from. On Debian-style versions of Linux the tools for managing packages are mainly dpkg, apt-get, and aptitude. A bit of man-page reading finds that dpkg -S can find what package installed a particular file. Therefore the following two commands will tell us what packages the shared objects come from:

$ dpkg -S /lib/ld-linux.so.2
libc6: /lib/ld-linux.so.2
$ dpkg -S libc.so.6
libc6: /lib/i386-linux-gnu/libc.so.6

This tells us that both shared objects come from the libc6 package, which is the GNU C Library. Notice that it isn’t necessary to pass a fully specified path to dpkg -S, although passing just the file name will occasionally lead to ambiguous answers.
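
The dpkg -S output is easy to process if you want to do this lookup for many files at once. The following is a rough Python sketch, not an official parser: the function name is hypothetical, and it assumes the "package: path" layout shown above, with comma-separated package names when several packages own the same path.

```python
def parse_dpkg_search(output):
    """Map each file path to the list of packages dpkg -S reported for it."""
    owners = {}
    for line in output.splitlines():
        # Assumption: lines describing diversions start like this; skip them.
        if line.startswith(("diversion ", "local diversion ")):
            continue
        # Split on the last ": " so that an architecture qualifier such as
        # "libc6:i386" stays attached to the package name.
        packages, separator, path = line.rpartition(": ")
        if not separator:
            continue
        owners[path] = [name.strip() for name in packages.split(",")]
    return owners


sample = """\
libc6: /lib/ld-linux.so.2
libc6: /lib/i386-linux-gnu/libc.so.6
"""
for path, packages in parse_dpkg_search(sample).items():
    print(path, "->", ", ".join(packages))
```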

Most packages have symbols stripped in order to save space and download bandwidth – many Linux users don’t need symbols. Instead the symbols are available from a separate package. The convention is that this package has the same name with -dbg or -dbgsym appended. The -dbg packages are hand-built for critical packages such as libc6. The -dbgsym packages are the new standard on Ubuntu for publishing symbols, and are published in separate repositories, to reduce the size and bandwidth requirements of the main repositories.

In the case of libc6 there are both libc6-dbg and libc6-dbgsym packages, but initially only libc6-dbg will install, so that’s what we do, with this syntax:

$ sudo apt-get install libc6-dbg

Now when we debug our program and issue the “info shared” command we have symbols for all of our shared objects:

(gdb) info shared
From        To          Syms Read   Shared Object Library
0xb7fde820  0xb7ff6b9f  Yes         /lib/ld-linux.so.2
0xb7e36f10  0xb7f6b5cc  Yes         /lib/i386-linux-gnu/libc.so.6

That wasn’t too painful, and now if we crash inside of the GNU C Library, or if our profiler shows it on the call stack, we’ll have accurate symbolic names. Our debugging session now looks something like this:

Program received signal SIGSEGV, Segmentation fault.
0xb7d7fe29 in _IO_vfprintf_internal (s=0xb7ee1a20, format=<optimized out>, ap=0xbffff338) at vfprintf.c:1630
(gdb) bt
#0 0xb7d7fe29 in _IO_vfprintf_internal (s=0xb7ee1a20, format=<optimized out>, ap=0xbffff338) at vfprintf.c:1630
#1 0xb7d87eff in __printf (format=0x8048710 "Hello world - %s!\n") at printf.c:35
#2 0x080485c1 in main (argc=1, argv=0xbffff3e4) at test.cpp:7

The details don’t show up particularly well in a narrow blog post, but we have the correct name for the function we crashed in, parameter values, and file names and line numbers that will allow us to locate the crash in the libc source code should we decide to download it.

Note: There can be some confusion if you look at the description for libc6-dbg because it is listed as being symbols for the Embedded GNU C Library. It turns out that EGLIBC has been the standard for Debian based variants of Linux for several years, even when they aren’t running on embedded devices.

Note that if you are running an x64 version of Linux then by default apt-get will install 64-bit packages. If you want the 32-bit version of libc6 and its symbols, for developing and debugging 32-bit programs, you need to append :i386 to the package names, like this:

sudo apt-get install libc6:i386
sudo apt-get install libc6-dbg:i386


More symbols – the same but far more complicated

All is then well until you start using C++ features, and then some new shared modules show up with missing symbols. After adding some use of std::cout to our test program and running info shared from gdb we see the following results (edited to avoid unsightly word wrapping):

(gdb) info shared
From        Syms Read   Shared Object Library
0xb7fde820  Yes         /lib/ld-linux.so.2
0xb7f2ad00  Yes (*)     /usr/lib/i386-linux-gnu/libstdc++.so.6
0xb7d51f10  Yes         /lib/i386-linux-gnu/libc.so.6
0xb7d12430  Yes         /lib/i386-linux-gnu/libm.so.6
0xb7cf1f50  Yes (*)     /lib/i386-linux-gnu/libgcc_s.so.1
(*): Shared library is missing debugging information.

Sigh. Yes, we’ve still got two shared objects with no debugging information. The dpkg -S dance tells us that they come from two different packages: libstdc++6 (the C++ library) and libgcc1 (the gcc support library):

$ dpkg -S /usr/lib/i386-linux-gnu/libstdc++.so.6
libstdc++6: /usr/lib/i386-linux-gnu/libstdc++.so.6
$ dpkg -S /lib/i386-linux-gnu/libgcc_s.so.1
libgcc1: /lib/i386-linux-gnu/libgcc_s.so.1

Installing libgcc1-dbg works smoothly:

$ sudo apt-get install libgcc1-dbg

The following NEW packages will be installed:
   libgcc1-dbg (4.6.3-1ubuntu5)
0 upgraded, 1 newly installed, 0 to remove and 18 not upgraded.

However things do not go so smoothly with libstdc++6-dbg and libstdc++6-dbgsym:

$ sudo apt-get install libstdc++6-dbg

Package libstdc++6-dbg is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package 'libstdc++6-dbg' has no installation candidate
$ sudo apt-get install libstdc++6-dbgsym
E: Unable to locate package libstdc++6-dbgsym
E: Couldn't find any package by regex 'libstdc++6-dbgsym'

At this point the search for symbols can quickly become a dead-end, but I have found two solutions. One is documented in the New repositories – ddebs section and the other, suggested by a reader of this post, is in the Versioned C++ section.

New repositories – ddebs

While the normal Ubuntu repositories contain a few -dbg packages, the ddebs.ubuntu.com repository should contain -dbgsym packages for almost everything. The ddebs site is created using pkg-create-dbgsym and is the new way to get debug packages built for virtually everything. In particular ddebs.ubuntu.com contains the libstdc++6-dbgsym package. However ddebs.ubuntu.com is not accessible by default because most users don’t need it. You can find descriptions of how to add this repository here and here. The main thing to do is to append the following four lines to /etc/apt/sources.list (must be root) in order to make all sixteen variants of updates for Ubuntu 12.04 (precise pangolin) available:

deb http://ddebs.ubuntu.com precise main restricted universe multiverse
deb http://ddebs.ubuntu.com precise-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com precise-security main restricted universe multiverse
deb http://ddebs.ubuntu.com precise-proposed main restricted universe multiverse
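
The "sixteen variants" are presumably the four suites (the release plus -updates, -security and -proposed) multiplied by the four components on each line. If you manage more than one Ubuntu release, a small Python sketch like this can generate the lines for a given codename. The function name is made up, and the suite and component lists are simply copied from the lines above rather than taken from any official source.

```python
def ddebs_source_lines(codename):
    """Build the four deb lines for ddebs.ubuntu.com for one release."""
    # Suites and components as they appear in the lines above.
    suites = ["", "-updates", "-security", "-proposed"]
    components = "main restricted universe multiverse"
    return [
        "deb http://ddebs.ubuntu.com {}{} {}".format(codename, suite, components)
        for suite in suites
    ]


for line in ddebs_source_lines("precise"):
    print(line)
```

The output still has to be appended to /etc/apt/sources.list as root, followed by an apt-get update.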

But you’re not done yet. You need to add the public key so that this repository will be trusted. The links above explain how to do this or you can use the method documented here: navigate to http://ddebs.ubuntu.com with your browser, download dbgsym-release-key.asc, and then run the Ubuntu Software Center and use Edit-> Software Sources-> Authentication-> Import Key File… to add it.

With that done we are ready for our final obstacle. Since libstdc++6-dbg is still not accessible we install libstdc++6-dbgsym, which leads to this frustrating message:

$ sudo apt-get install libstdc++6-dbgsym

The following packages will be REMOVED:
   libgcc1-dbg (4.6.3-1ubuntu5)
The following NEW packages will be installed:
   libstdc++6-dbgsym (4.6.3-1ubuntu5)
0 upgraded, 1 newly installed, 1 to remove and 18 not upgraded.

Got that? For reasons that I do not yet understand, the symbols for libgcc_s.so.1 and libstdc++.so.6 cannot be installed simultaneously. They don’t install any of the same files, but there must be some hidden conflict between the two packages. I had to choose which symbols I would keep installed. Conveniently enough, on one project with libgcc1-dbg installed gdb would crash at startup, which made the choice easy. libstdc++6-dbgsym gets to stay.

Versioned C++

Another way of getting the libstdc++6 symbols was suggested in a comment to this article by James McCoy. It turns out that you can have multiple versions of libstdc++6 installed and you can ask to have a specific version of the symbols installed. The first thing we need to do is get a list of all of the packages that could conceivably be related to libstdc++. The apt-cache search command will do that for us, but we have to be careful. A search on libstdc++6 will find no hits, even though this library is clearly installed. That’s because the “+” characters have special regex meaning. We need to escape them with a backslash, and then we either need to escape the backslashes or put the whole parameter in quotes to stop the shell from processing the backslashes. That leaves us this command:

apt-cache search -n "libstdc\+\+6"

The important parts of the output are here:

libstdc++6 - GNU Standard C++ Library v3
libstdc++6-4.4-dbg - GNU Standard C++ Library v3 (debugging files)
libstdc++6-4.4-dev - GNU Standard C++ Library v3 (development files)
libstdc++6-4.4-doc - GNU Standard C++ Library v3 (documentation files)
libstdc++6-4.5-dbg - GNU Standard C++ Library v3 (debugging files)
libstdc++6-4.5-dev - GNU Standard C++ Library v3 (development files)
libstdc++6-4.5-doc - GNU Standard C++ Library v3 (documentation files)
libstdc++6-4.6-dbg - GNU Standard C++ Library v3 (debugging files)
libstdc++6-4.6-dev - GNU Standard C++ Library v3 (development files)
libstdc++6-4.6-doc - GNU Standard C++ Library v3 (documentation files)
Various arm and -pic variations elided…
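
The two layers of escaping in that search command (one for the regex, one for the shell) are easy to get wrong by hand. As an illustration only, this Python sketch builds the same search: re.escape adds the regex backslashes, and shlex.quote shows how the argument would have to be quoted if typed into a shell. One caveat: re.escape targets Python's regex syntax, not the POSIX syntax apt-cache uses, so treat it as a convenience that happens to give the right answer for "+" and not as a general solution.

```python
import re
import shlex

package = "libstdc++6"
pattern = re.escape(package)  # adds a backslash before each "+"

# Passing an argument list to a process (for example with subprocess.run)
# bypasses the shell entirely, so no second level of escaping is needed.
argv = ["apt-cache", "search", "-n", pattern]

print(pattern)
# What you would type at a shell prompt to pass the same arguments:
print(" ".join(shlex.quote(arg) for arg in argv))
```

The second line it prints uses single quotes around the pattern, which is the form suggested in the comments below.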

We have three versions of libstdc++6 symbols available so we need to figure out which one to install in order to match the version of libstdc++ on our machine. We can find that version information with dpkg -s.

$ dpkg -s libstdc++6

Package: libstdc++6

Version: 4.6.3-1ubuntu5

With that we are ready to install our symbols:

sudo apt-get install libstdc++6-4.6-dbg
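
The step from the version string to the package name is just "keep the first two numeric components". A minimal sketch of that step in Python follows. The function name is hypothetical, and the naming pattern is inferred only from the Ubuntu 12.04 package list shown above, so it may not hold on other releases or for other libraries. Versions with an epoch prefix are not handled.

```python
def versioned_dbg_package(package, version):
    """Guess the versioned -dbg package name for an installed version."""
    upstream = version.split("-", 1)[0]              # "4.6.3-1ubuntu5" -> "4.6.3"
    major_minor = ".".join(upstream.split(".")[:2])  # "4.6.3" -> "4.6"
    return "{}-{}-dbg".format(package, major_minor)


print(versioned_dbg_package("libstdc++6", "4.6.3-1ubuntu5"))
```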

This also installs libgcc1-dbg, so this should give full symbols. The debug package does cause one weird cgdb startup problem:

(gdb) r
Starting program: /home/bruced/tests/floattest/a.out
Traceback (most recent call last):
  File "/usr/lib/debug/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16-gdb.py", line 62, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named libstdcxx.v6.printers

Breakpoint 1, main (argc=1, argv=0xbffff3e4) at test.cpp:7
(gdb)

Apparently you can put a Python script in the debug path and have it executed when the debug file is loaded. On some machines this script hits an error when trying to load the pretty printers. I renamed the script and that resolved the error. That will do for now.

It’s potentially handy to be able to install symbols for old versions of libstdc++6. However installing one of these uninstalls whatever symbols you have for other versions. It also seems distinctly asymmetric that there are three versions of symbols listed but only one version of the library itself. I cannot discern the intent.

Symbols for other packages

Once you’ve got the ddebs repositories set up you can easily install symbols for most packages. For instance, if you crash in libX11.so.6 you can use dpkg -S to find what package that file comes from and then just add -dbgsym to the package name and install that.

$ dpkg -S libX11.so.6
libx11-6: /usr/lib/i386-linux-gnu/libX11.so.6
$ sudo apt-get install libx11-6-dbgsym

How does it work?

Once the symbols are installed gdb automatically finds them, through a very simple mechanism. The C++ library is located at

/usr/lib/i386-linux-gnu/libstdc++.so.6

but this is just a symbolic link to

/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16

Meanwhile the symbols are installed in:

/usr/lib/debug/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16

It turns out that gdb maintains a colon separated list of debug directories which can be viewed like this:

(gdb) show debug-file-directory
The directory where separate debug symbols are searched for is "/usr/lib/debug".

By default this list of debug directories contains just one entry, and symbols are looked for by appending the full shared object path to this directory. There are other symbol searching techniques used by gdb, but this is all we need to know for now. If you want to add an additional directory to the symbol search path you can do it with this gdb command:

(gdb) set debug-file-directory /usr/lib/debug:/home/bruced/mysymbols
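
To make the lookup rule concrete, here is a small Python sketch that computes where a debug file would be expected under each directory in the list. It models only the simple rule described in this post (debug directory plus full library path), not gdb's other search techniques, and the function name is made up. It expects the already-resolved library path; on a real system os.path.realpath would follow the symbolic link first.

```python
import posixpath


def debug_file_candidates(resolved_library_path, debug_file_directory):
    """List the debug file locations implied by the rule described above."""
    relative = resolved_library_path.lstrip("/")
    return [
        posixpath.join(directory, relative)
        for directory in debug_file_directory.split(":")
        if directory  # ignore empty entries such as a trailing colon
    ]


for candidate in debug_file_candidates(
    "/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16",
    "/usr/lib/debug:/home/bruced/mysymbols",
):
    print(candidate)
```

The first candidate printed matches the location where the libstdc++ symbols were installed.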

Source also!

I just realized (February 2015) that I never explained how to install the libc source – maybe I didn’t know how to when I wrote this. It’s pretty easy. Just create a directory to hold the source and run this:

apt-get source libc6

That’s it. Unfortunately you then need to tell gdb to find the source, and I have yet to find an easy and complete way to do this. In fact, because libc6 is built without full source paths I think it is impossible to easily and completely tell gdb where all of the libc6 source-code is. That is unfortunate. libc6 should fix their build process to correct that, IMHO. Anyway, what I end up doing is adding lines like this to ~/.gdbinit. As I find myself needing more directories I add them. It’s not perfect, but it works:

directory /data/home/bruced/libcsource/eglibc-2.15/stdio-common/
directory /data/home/bruced/libcsource/eglibc-2.15/malloc/
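
If adding directories one at a time gets old, one alternative is to generate a directory line for every folder in the source tree that holds C files and paste the result into ~/.gdbinit. This is only a sketch of that idea, which I have not compared against the incremental approach: the function name is hypothetical, it looks only for .c and .h files, and a long source path may have costs in gdb that I have not measured.

```python
import os


def gdbinit_directory_lines(source_root):
    """Return one gdb 'directory' line per folder that contains C sources."""
    lines = []
    # Sort the walk results so the output order is stable between runs.
    for folder, _subdirs, files in sorted(os.walk(source_root)):
        if any(name.endswith((".c", ".h")) for name in files):
            lines.append("directory " + folder)
    return lines


# Example (the path is a placeholder for wherever apt-get source put the tree):
# print("\n".join(gdbinit_directory_lines("/data/home/bruced/libcsource/eglibc-2.15")))
```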

See my Steam Dev Days talk and slides for more details.

Summary

In order to make this post more useful as a reference here is a summary of what was covered:

  • dpkg -S libc.so.6: finds what package an installed file comes from
  • sudo apt-get install libc6-dbg: installs the debug symbols for the libc6 package
  • sudo apt-get install libc6-dbg:i386: installs the debug symbols for the 32-bit x86 libc6 package when running on 64-bit Linux
  • apt-cache search -n "libstdc\+\+6": does a regex search through the package names only (that is what -n does; without it the descriptions are searched as well) – escape characters and quotes are needed in this case
  • sudo apt-get install libstdc++6-4.6-dbg: installs the debug symbols for a particular version of libstdc++
  • dpkg -s libstdc++6: gets details about an installed package including its version information
  • show debug-file-directory: gdb command to show the colon separated list of debug directories
  • set debug-file-directory: gdb command to set the colon separated list of debug directories

Acknowledgements

Thanks to all my co-workers and members of the Linux community who have helped me figure out what I know so far. Apologies if I have misrepresented any of what I have been taught, and I hope that this information is useful.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x faster. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And juggle.
This entry was posted in Linux, Programming, Symbols.

18 Responses to Symbols on Linux Part One: g++ Library Symbols

  1. Bob Pelerson says:

    Any chance of adding FreeBSD to the list of supported platforms?

    Linux is alright but it would be a darn shame if I had to give up FreeBSD to game.

  2. James McCoy says:

    Ubuntu does have the debug packages for libstdc++6 available without adding the ddeb repos. However those repos will help if you want to debug other packages which don’t explicitly provide their own debug packages.

    What tripped you up is that debug packages for libstdc++6 are specific to the compiler version. “apt-cache search -n libstdc\\+\\+6” (the escapes are so apt-cache doesn’t treat the + as regex metacharacters) will show all the packages with “libstdc++6” in their name. You should see libstdc++6-4.4-dbg, libstdc++6-4.5-dbg, and libstdc++6-4.6-dbg along with a bunch of other libstdc++6 related packages.

    • brucedawson says:

      The great thing about blogging is learning from the comments that people post, so thank you. However…

      I thought that the package management system was supposed to handle versioning for me? The documentation I’ve read — and the example of libc6 — suggests that appending -dbg or -dbgsym should find the correct debug package for the package you have installed, including the correct version. Installing old versions of libc6-dbg is actually quite tricky, so it didn’t occur to me to worry about versioning.

      The alternative you suggest requires significant extra knowledge, including realizing that the compiler version is relevant, checking it, assuming that gcc 4.6.3 requires libstdc++6-4.6, and then installing that package.

      What’s worse is that following these steps on one of my machines caused gdb to hit startup problems: “File /usr/lib/debug/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16-gdb.py, line 62, Import Error: No module named libstdcxx.v6.printers”. Weird.

      I’m also slightly confused by why the escaping of the search command requires double-backslashes.

      Anyway, I appreciate your comment, even if it leaves me more confused than before!

      • James McCoy says:

        The package management system does handle versioning for you, but g++ versions 4.4, 4.5, and 4.6 are co-installable. It would probably make sense for there to be a libstdc++6-dbg that aligns with the g++ version installed by the g++ metapackage, so the simple case of “apt-get install g++ libstdc++6-dbg” would work. It looks like the packaging used to be setup like that. I’m not sure why it isn’t anymore.

        As for the double-backslashes, if you only typed libstdc\+\+6 then all you’ve done is told the shell not to treat the + specially. You’re still passing an unescaped + into apt-cache. In order for apt-cache to see the backslash, you need to either escape the backslash again (as I did above) or quote the argument — apt-cache search ‘libstdc\+\+6’.

        • brucedawson says:

          Ah — of course it’s the shell that is the other level of parsing. And if the shell treated ‘+’ as a special character then you would need triple backslashes! I like the quoted argument with single backslashes better.

          I’m still getting used to the implications of having the shell doing globbing. The difference between p4 have * and p4 have “*” is pretty important.

          Anyway, now I just need to figure out why I’m getting that import error on one machine. And I will update the post to include your observations.

  3. Wyatt says:

    You’ve reminded me anew how spoiled I am to have Gentoo Portage as my package manager. -g in my CFLAGS and FEATURES=split-debug and life is quite nice. None of this inane dance to get headers and symbols, and much more usable tools for searching and management.

    I won’t presume to dictate what Valve should and shouldn’t target, but I would definitely encourage you to give us a shot some time.

  4. Sigfried mcWild says:

    Everything in Gentoo is built on the machine, the bootstrap packages are rebuilt during the (manual) installation process. The kernel is always built by hand (I don’t think it’s even managed by the package manager, only the headers are)

  5. Pingback: Symbols on Linux Part Two: Symbols for Other Versions | Random ASCII

  6. Pingback: Symbols on Linux Part Three: Linux versus Windows | Random ASCII

  7. Pingback: Symbols on Linux update: Fedora Fixes | Random ASCII

  8. Pingback: Symbols the Microsoft Way | Random ASCII

  9. Pingback: Counting to Ten on Linux | Random ASCII

  10. After I watched your “Getting Started Debugging on Linux” I simply fell in love with the way you explain stuff and how reasonable you are about making considerations. You are a very knowledgeable person and we are lucky that you share your knowledge with us.

    Thanks a lot and keep up the superb work!
