Generate Stack Traces on Crash Portably in C++

There are three basic steps to getting this done: trapping signals, getting the stack frames, and demangling the C++ symbols.

Trapping Signals

When a program crashes, the operating system sends it a signal to give it one last chance to wrap up what it is doing before the OS terminates the process. Your program can intercept the signal and run code. You could, for example, try to recover from the error. In this case all we want to do is print out a stack trace.

Trapping the signal is pretty straightforward. But first a little housekeeping. We’re going to create this system as a trivial C++ class that only has a constructor:

class KxStackTrace
{
public:
    KxStackTrace();
};

We will instantiate this class as a singleton in global scope, or you could instantiate it in main(). The constructor will register the signal handlers, so all that matters is that a single instance is created sometime before you need it.

To start handling signals you only need to register a signal handler using the signal function:

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

// the handler itself is defined further down
void abortHandler( int signum );

KxStackTrace::KxStackTrace()
{
  signal( SIGABRT, abortHandler );
  signal( SIGSEGV, abortHandler );
  signal( SIGILL,  abortHandler );
  signal( SIGFPE,  abortHandler );
}

The signal() function is part of the C standard library, which is why it is supported on Windows as well as Linux and OS X. When a given signal is generated it will call the handler function passed as the second argument. You can trap almost any operating system signal ( SIGKILL and SIGSTOP cannot be caught ). In this case we are trapping the most important signals that are generated when the OS detects an execution problem.

SIGABRT is generated when the program calls the abort() function, such as when an assert() triggers.

SIGSEGV and SIGBUS are generated when the program makes an illegal memory access, such as reading unaligned memory, dereferencing a NULL pointer, reading memory out of bounds etc. SIGBUS is not supported under Windows.

SIGILL is generated when the program tries to execute a malformed instruction. This happens when the instruction pointer ends up in non-program data, for example when a pointer to a function is corrupted.

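SIGFPE is generated by an arithmetic error. Despite the name, the most common cause is integer division by zero. On most platforms floating point division by zero or overflow only raises it if floating point exceptions have been explicitly enabled.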
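To see the registration mechanism in isolation, here is a minimal sketch that installs a handler with signal() and then triggers it with raise(), so nothing actually has to crash. The names gLastSignal, recordSignal() and raiseAndRecord() are made up for this example and are not part of KxStackTrace.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical flag for this sketch: the last signal number the handler saw.
static volatile std::sig_atomic_t gLastSignal = 0;

// Minimal handler. It only stores the signal number, which is one of the
// few things a signal handler is always allowed to do.
extern "C" void recordSignal( int signum )
{
   gLastSignal = signum;
}

// Installs recordSignal() for the given signal, raises that signal, and
// returns the signal number the handler recorded ( or -1 on failure ).
int raiseAndRecord( int signum )
{
   gLastSignal = 0;
   if ( std::signal( signum, recordSignal ) == SIG_ERR )
      return -1;

   // raise() does not return until the handler has finished.
   std::raise( signum );
   return gLastSignal;
}
```

Calling raiseAndRecord( SIGILL ) should hand back SIGILL. A real crash handler would print a trace and terminate instead of returning, because returning from a handler for a genuine SIGSEGV or SIGFPE just re-executes the faulting instruction.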

To get a list of signals supported by your operating system run the following command:

$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGEMT       8) SIGFPE       9) SIGKILL     10) SIGBUS
11) SIGSEGV     12) SIGSYS      13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGURG      17) SIGSTOP     18) SIGTSTP     19) SIGCONT     20) SIGCHLD
21) SIGTTIN     22) SIGTTOU     23) SIGIO       24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGPWR      30) SIGUSR1

Now we implement the signal handler function itself:

void abortHandler( int signum )
{
   // associate each signal with a signal name string.
   const char* name = NULL;
   switch( signum )
   {
   case SIGABRT: name = "SIGABRT";  break;
   case SIGSEGV: name = "SIGSEGV";  break;
#ifdef SIGBUS // not defined on Windows
   case SIGBUS:  name = "SIGBUS";   break;
#endif
   case SIGILL:  name = "SIGILL";   break;
   case SIGFPE:  name = "SIGFPE";   break;
   }

   // Notify the user which signal was caught. We use fprintf, because it is
   // about the most basic output function there is. Once you get a crash, it
   // is possible that more complex output systems like streams may be
   // corrupted. Strictly speaking fprintf is not guaranteed to be safe
   // inside a signal handler either, but since the program is about to die
   // anyway it is a reasonable compromise.
   if ( name )
      fprintf( stderr, "Caught signal %d (%s)\n", signum, name );
   else 
      fprintf( stderr, "Caught signal %d\n", signum );

   // Dump a stack trace.
   // This is the function we will be implementing next.
   printStackTrace();

   // If you caught one of the above signals, it is likely you just 
   // want to quit your program right now.
   exit( signum );
}
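If you want to be stricter than the fprintf() call in the handler above, POSIX lists write() and _exit() among the functions that are safe to call from a signal handler, while the stdio functions are not on that list. The following is only a sketch of that idea, assuming a POSIX system ( it needs unistd.h ). The names formatSignalMessage() and minimalAbortHandler() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <unistd.h>   // write(), _exit(), STDERR_FILENO ( POSIX only )

// Hypothetical helper: writes "Caught signal <n>\n" into buf without using
// stdio. Returns the number of characters written, not counting the
// terminating NUL, or 0 if the buffer is too small.
size_t formatSignalMessage( char* buf, size_t bufsize, int signum )
{
   const char prefix[] = "Caught signal ";
   const size_t prefixLen = sizeof( prefix ) - 1;

   // Convert the number to decimal digits, least significant digit first.
   char digits[16];
   size_t ndigits = 0;
   unsigned int value = ( signum < 0 ) ? 0u : (unsigned int) signum;
   do
   {
      digits[ndigits++] = (char)( '0' + value % 10 );
      value /= 10;
   } while ( value != 0 && ndigits < sizeof( digits ));

   // Room for prefix, digits, newline and NUL.
   if ( bufsize < prefixLen + ndigits + 2 )
      return 0;

   memcpy( buf, prefix, prefixLen );
   size_t pos = prefixLen;
   while ( ndigits > 0 )
      buf[pos++] = digits[--ndigits];
   buf[pos++] = '\n';
   buf[pos] = '\0';
   return pos;
}

// Handler variant that reports the signal using only write() and _exit().
extern "C" void minimalAbortHandler( int signum )
{
   char msg[64];
   size_t len = formatSignalMessage( msg, sizeof( msg ), signum );
   if ( len > 0 )
   {
      ssize_t ignored = write( STDERR_FILENO, msg, len );
      (void) ignored;
   }
   _exit( 1 );
}
```

The trade-off is that you give up formatted output, so anything beyond a fixed message and a number has to be assembled by hand.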

Using sigaction() on POSIX Systems

The example above will work on both Windows and POSIX systems. On any reasonably modern POSIX platform you can use the newer sigaction() function. This does the same thing but provides additional information about the signal, and a little more control over which signals you handle.

KxStackTrace::KxStackTrace()
{
   // zero-initialize so that unused fields have defined values
   struct sigaction sa = {};
   sa.sa_flags = SA_SIGINFO;
   sa.sa_sigaction = abortHandler;
   sigemptyset( &sa.sa_mask );

   sigaction( SIGABRT, &sa, NULL );
   sigaction( SIGSEGV, &sa, NULL );
   sigaction( SIGBUS,  &sa, NULL );
   sigaction( SIGILL,  &sa, NULL );
   sigaction( SIGFPE,  &sa, NULL );
   sigaction( SIGPIPE, &sa, NULL );
}

The handler call will need to have some additional arguments:

void abortHandler( int signum, siginfo_t* si, void* unused )
{
}

We won’t use the extra information in our code here, but with sigaction() you can find out things like the process id and user id of the sending process ( e.g. if a kill signal was sent from another process ). In the case of a crash you can also get the address of the memory access or instruction that generated the signal. It’s nice to know you have options.
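As an illustration of that extra information, the sketch below installs an SA_SIGINFO handler for SIGUSR1, sends the signal to its own process with kill(), and checks that si_pid reports the sender. It assumes a single-threaded POSIX program. The names gSeenSignal, gSenderPid, infoHandler() and senderIsSelf() are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical globals for this sketch, filled in by the handler.
static volatile sig_atomic_t gSeenSignal = 0;
static volatile sig_atomic_t gSenderPid  = 0;

// Three-argument handler as required by SA_SIGINFO.
extern "C" void infoHandler( int signum, siginfo_t* si, void* /*context*/ )
{
   gSeenSignal = signum;
   gSenderPid  = (sig_atomic_t) si->si_pid;   // pid of the sending process
}

// Installs infoHandler() for SIGUSR1, sends SIGUSR1 to this process and
// returns true if the handler saw this process as the sender.
bool senderIsSelf()
{
   struct sigaction sa = {};
   sa.sa_flags = SA_SIGINFO;
   sa.sa_sigaction = infoHandler;
   sigemptyset( &sa.sa_mask );
   if ( sigaction( SIGUSR1, &sa, NULL ) != 0 )
      return false;

   // In a single-threaded program the signal is delivered before kill()
   // returns, so the globals are already filled in afterwards.
   if ( kill( getpid(), SIGUSR1 ) != 0 )
      return false;

   return gSeenSignal == SIGUSR1 && gSenderPid == (sig_atomic_t) getpid();
}
```

For a crash signal such as SIGSEGV you would look at si->si_addr instead, which holds the faulting address.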

Printing a Basic Stack Trace

Linux and OS X

On Linux and other POSIX-like systems that provide execinfo.h ( such as OS X ), you will want to use backtrace() and backtrace_symbols(). The strings returned by backtrace_symbols() live in a single buffer allocated by malloc(). To be clean you will want to free it before you exit.

backtrace() returns the raw stack trace information, as an array of stack frames. If the length is zero then it is likely the stack was corrupt, and you are done. If you get some stack frames then you can turn them into a somewhat readable form using backtrace_symbols():

#include <stdio.h>
#include <stdlib.h>
#include <execinfo.h>

static inline void printStackTrace( FILE *out = stderr, unsigned int max_frames = 63 )
{
   fprintf(out, "stack trace:\n");

   // storage array for stack trace address data
   // ( a variable-length array, which gcc and clang accept as an extension )
   void* addrlist[max_frames+1];

   // retrieve current stack addresses
   int addrlen = backtrace( addrlist, sizeof( addrlist ) / sizeof( void* ));

   if ( addrlen <= 0 )
   {
      fprintf( out, "  <empty, possibly corrupt>\n" );
      return;
   }

   // create readable strings for each frame.
   // the result is NULL if the strings could not be allocated.
   char** symbollist = backtrace_symbols( addrlist, addrlen );
   if ( symbollist == NULL )
      return;

   // print the stack trace, skipping the first four frames.
   for ( int i = 4; i < addrlen; i++ )
      fprintf( out, "%s\n", symbollist[i]);

   free(symbollist);
}

Note that we don’t print the first four stack frames. These are not interesting because they will be the functions called by the stack trace code itself. The program function that generated the signal will be at offset 4.

On Linux the output looks something like this:

Caught signal 6 (SIGABRT)
stack trace:
  /lib/libc.so.6(abort+0x180) [0x7f96afdf9f70]
  /lib/libc.so.6(__assert_fail+0xf1) [0x7f96afdf02b1]
  ./bin/linux-x86_64/debug/kxf_test() [0x4359fd]
  ./bin/linux-x86_64/debug/kxf_test(_Z8testPathv+0x29b4) [0x4156e8]
  ./bin/linux-x86_64/debug/kxf_test(_Z6doTestPKcPFvvE+0xc8) [0x424c89]
  ./bin/linux-x86_64/debug/kxf_test(main+0xe4) [0x424de2]
  /lib/libc.so.6(__libc_start_main+0xfd) [0x7f96afde3c4d]
  ./bin/linux-x86_64/debug/kxf_test() [0x412c79]

Note that the C++ symbols are still mangled and unreadable. The output for OS X will be somewhat different, but similarly unreadable. We will need to demangle the symbols. More on that later.

In the meantime, if you have a program that dumps mangled stack traces, you can cut and paste them into my online gcc C++ demangler to get demangled symbols.
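One caveat with the Linux and OS X version above: backtrace_symbols() calls malloc(), which may itself fail if the crash was caused by heap corruption. glibc and OS X also provide backtrace_symbols_fd(), which writes the same lines straight to a file descriptor without allocating. Here is a minimal sketch of that variant. The helper name dumpStackToFd() is hypothetical, and the sketch assumes execinfo.h is available.

```cpp
#include <cassert>
#include <execinfo.h>
#include <unistd.h>

// Hypothetical helper: writes the current call stack to the given file
// descriptor, one frame per line, and returns the number of frames captured.
int dumpStackToFd( int fd )
{
   void* addrlist[64];

   // retrieve current stack addresses
   int addrlen = backtrace( addrlist, 64 );

   // write the symbol lines directly to fd, no malloc()-ed string array
   if ( addrlen > 0 )
      backtrace_symbols_fd( addrlist, addrlen, fd );

   return addrlen;
}
```

A handler could call dumpStackToFd( STDERR_FILENO ) in place of printStackTrace(). The catch is that the lines go out exactly as the library formats them, so there is no opportunity to skip frames or demangle the names in-process.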

Windows

Getting and printing the stack trace on Windows is a nightmare. To start with, the functions needed to dump the stack are not available by default, but must be loaded from dbghelp.dll or imagehlp.dll. The stack dumping function itself has changed from StackWalk to StackWalk64. Finding the symbols themselves is complex.

I use a prebuilt stack dumping class called StackWalker, modified somewhat to remove stuff I don’t need.

You could use the StackWalker unmodified, and derive from that class to change the output. I just simplified the code to make the changes I need. If you use StackWalker, then dumping a stack trace becomes pretty easy:

#include "StackWalker.h"

static inline void printStackTrace( FILE *out = stderr )
{
   fprintf( out, "stack trace:\n" );
   StackWalker sw;
   sw.ShowCallstack();
}

Unlike the POSIX version, this one will print your C++ stack symbols demangled by default, which is nice. It also prints the source file name and line number, which the POSIX version doesn’t do.

Caught signal 22 (SIGABRT)
stack trace:
  f:\dd\vctools\crt_bld\self_x86\crt\src\abort.c               (74):   abort
  f:\dd\vctools\crt_bld\self_x86\crt\src\assert.c              (336):  _wassert
  c:\users\rafael\desktop\milk2\kxf\src\win32\kxf_pathos.cpp   (740):  KxfPath::listFiles
  c:\users\rafael\desktop\milk2\kxf\src\kxf_test\test_path.cpp (213):  testPath
  c:\users\rafael\desktop\milk2\kxf\src\kxf_test\kxf_test.cpp  (25):   doTest
  c:\users\rafael\desktop\milk2\kxf\src\kxf_test\kxf_test.cpp  (55):   main
  f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c                (278):  __tmainCRTStartup
  f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c                (189):  mainCRTStartup
  75A2339A      (kernel32):     (filename not available):      BaseThreadInitThunk
  77679EF2      (ntdll):        (filename not available):      RtlInitializeExceptionChain
  77679EC5      (ntdll):        (filename not available):      RtlInitializeExceptionChain

Demangling GCC C++ Symbols

We’re almost done. We just need to make the POSIX output more readable. Demangling gcc symbols is easy:

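#include <stdlib.h>
#include <cxxabi.h>

...
int status = 0;
size_t unmangledNameSize = 1024;
// the buffer must come from malloc(), because __cxa_demangle() may realloc() it
char* unmangledName = (char*) malloc( unmangledNameSize );
char* ret = abi::__cxa_demangle( mangledName, unmangledName,
                                 &unmangledNameSize, &status );
if ( status == 0 )
   unmangledName = ret; // use possibly realloc()-ed string
...
free( unmangledName );

You can allocate the buffer for the unmangled name yourself, but it has to be allocated with malloc(), because __cxa_demangle() will realloc() it if the name does not fit. In the above example we start with a 1024 byte buffer. The entire function is below. It has a bunch of uninteresting string parsing code that parses out the mangled names, passes them to __cxa_demangle(), and prints them out. It does it slightly differently for Linux and OS X.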

#include <stdio.h>
#include <stdlib.h>
#include <execinfo.h>
#include <errno.h>
#include <cxxabi.h>

static inline void printStackTrace( FILE *out = stderr, unsigned int max_frames = 63 )
{
   fprintf(out, "stack trace:\n");

   // storage array for stack trace address data
   void* addrlist[max_frames+1];

   // retrieve current stack addresses
   unsigned int addrlen = backtrace( addrlist, sizeof( addrlist ) / sizeof( void* ));

   if ( addrlen == 0 ) 
   {
      fprintf( out, "  <empty, possibly corrupt>\n" );
      return;
   }

   // resolve addresses into strings containing "filename(function+address)",
   // Actually it will be ## program address function + offset
   // this array must be free()-ed
   char** symbollist = backtrace_symbols( addrlist, addrlen );
   if ( symbollist == NULL )
      return;

   // __cxa_demangle() may realloc() its output buffer, so the buffer has to
   // come from malloc() rather than from the stack.
   size_t funcnamesize = 1024;
   char* funcname = (char*) malloc( funcnamesize );

   // iterate over the returned symbol lines. skip the first four, they
   // belong to the signal handling and stack trace code itself.
   for ( unsigned int i = 4; i < addrlen; i++ )
   {
      char* begin_name   = NULL;
      char* begin_offset = NULL;
      char* end_offset   = NULL;

      // find parentheses and +address offset surrounding the mangled name
#ifdef DARWIN
      // OSX style stack trace
      for ( char *p = symbollist[i]; *p; ++p )
      {
         // never look at the character before the start of the line
         if (( *p == '_' ) && ( p > symbollist[i] ) && ( *(p-1) == ' ' ))
            begin_name = p-1;
         else if (( *p == '+' ) && ( p > symbollist[i] ))
            begin_offset = p-1;
      }

      if ( begin_name && begin_offset && ( begin_name < begin_offset ))
      {
         *begin_name++ = '\0';
         *begin_offset++ = '\0';

         // mangled name is now in [begin_name, begin_offset) and caller
         // offset starts at begin_offset. now apply
         // __cxa_demangle():
         int status = 0;
         char* ret = abi::__cxa_demangle( begin_name, funcname,
                                          &funcnamesize, &status );
         if ( status == 0 ) 
         {
            funcname = ret; // use possibly realloc()-ed string
            fprintf( out, "  %-30s %-40s %s\n",
                     symbollist[i], funcname, begin_offset );
         } else {
            // demangling failed. Output function name as a C function with
            // no arguments.
            fprintf( out, "  %-30s %-38s() %s\n",
                     symbollist[i], begin_name, begin_offset );
         }

#else // !DARWIN - but is posix
      // not OSX style
      // ./module(function+0x15c) [0x8048a6d]
      for ( char *p = symbollist[i]; *p; ++p )
      {
         if ( *p == '(' )
            begin_name = p;
         else if ( *p == '+' )
            begin_offset = p;
         else if ( *p == ')' && ( begin_offset || begin_name ))
            end_offset = p;
      }

      if ( begin_name && end_offset && ( begin_name < end_offset ))
      {
         *begin_name++   = '\0';
         *end_offset++   = '\0';
         if ( begin_offset )
            *begin_offset++ = '\0';

         // mangled name is now in [begin_name, begin_offset) and caller
         // offset in [begin_offset, end_offset). now apply
         // __cxa_demangle():

         int status = 0;
         char* ret = abi::__cxa_demangle( begin_name, funcname,
                                          &funcnamesize, &status );
         char* fname = begin_name;
         if ( status == 0 ) 
         {
            funcname = ret; // use possibly realloc()-ed string
            fname = ret;
         }

         if ( begin_offset )
         {
            fprintf( out, "  %-30s ( %-40s	+ %-6s)	%s\n",
                     symbollist[i], fname, begin_offset, end_offset );
         } else {
            fprintf( out, "  %-30s ( %-40s	  %-6s)	%s\n",
                     symbollist[i], fname, "", end_offset );
         }
#endif  // !DARWIN - but is posix
      } else {
         // couldn't parse the line? print the whole line.
         fprintf(out, "  %-40s\n", symbollist[i]);
      }
   }

   free(funcname);
   free(symbollist);
}

The new stack trace output looks something like this:

kxf_test: src/posix/kxf_pathos.cpp:824: virtual bool KxfPath::listFiles(
KxVector&, const KxfPath&, bool) const: Assertion `0' failed.
Caught signal 6 (SIGABRT)
stack trace:
/lib/libc.so.6 ( abort + 0x180 ) [0x7f1b6987ff70]
/lib/libc.so.6 ( __assert_fail + 0xf1 ) [0x7f1b698762b1]
./bin/linux-x86_64/debug/kxf_test ( ) [0x435a5d]
./bin/linux-x86_64/debug/kxf_test ( testPath() + 0x29b4) [0x415748]
./bin/linux-x86_64/debug/kxf_test ( doTest(char const*, void (*)()) + 0xc8 ) [0x424ce9]
./bin/linux-x86_64/debug/kxf_test ( main + 0xe4 ) [0x424e42]
/lib/libc.so.6 ( __libc_start_main + 0xfd ) [0x7f1b69869c4d]
./bin/linux-x86_64/debug/kxf_test ( ) [0x412cd9]
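If you only need the demangling step, for example in a separate tool that post-processes saved traces, it can be wrapped in a small helper. This is a sketch rather than part of the code above: demangleSymbol() is a hypothetical name, and it lets __cxa_demangle() allocate the buffer itself by passing NULL. It uses std::string, so it is better suited to post-processing than to running inside a crash handler.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled form of a gcc-style mangled
// name, or the input unchanged if it cannot be demangled ( which is what
// happens for plain C symbols such as "main" ).
std::string demangleSymbol( const char* mangled )
{
   int status = 0;

   // With a NULL output buffer __cxa_demangle() malloc()s one of the
   // right size, so the caller only has to free() the result.
   char* demangled = abi::__cxa_demangle( mangled, NULL, NULL, &status );
   if ( status != 0 || demangled == NULL )
      return std::string( mangled );

   std::string result( demangled );
   free( demangled );
   return result;
}
```

Fed the two mangled names from the earlier trace, demangleSymbol( "_Z8testPathv" ) gives "testPath()" and demangleSymbol( "_Z6doTestPKcPFvvE" ) gives "doTest(char const*, void (*)())", matching the output above.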

Global Objects with Side Effects

There is one final thing to be aware of. This is implemented as a trivial class with a single constructor that registers the signal handlers. The singleton object never has to be referred to anywhere else in your program. If you initialize the object inside main(), or in any of your regular code, everything will work fine. But if you do, crash dumps will not be available until after the start of main(). It would be better to instantiate this object in global scope so that the constructor runs before main().

The problem is that if you do this in a library, and the global object is never referred to anywhere else, the linker ( at least in gcc ) will not link this object in and the constructor will never fire. The answer is to allocate it as a member of some other global object that you know for certain will be referenced in any program. In my case, I just allocate the object as part of my memory manager, which will always be used:

class KxMemGlobals
{
...
   KxStackTrace  mStackTrace;
...
};
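The effect can be shown with stand-in classes. In the sketch below, StackTraceDemo plays the role of KxStackTrace and GlobalsDemo the role of KxMemGlobals. Both names, the counter, and the bytesAllocated() placeholder are invented for the example. The member's constructor runs as part of constructing the global owner, which happens before main() starts.

```cpp
#include <cassert>

// Hypothetical counter: how many times the stand-in constructor has run.
static int gConstructorRuns = 0;

// Stand-in for KxStackTrace: the constructor is the only thing that matters.
class StackTraceDemo
{
public:
   StackTraceDemo() { ++gConstructorRuns; }   // would register the handlers
};

// Stand-in for a global object that every program really uses.
class GlobalsDemo
{
public:
   int bytesAllocated() const { return 0; }   // placeholder for real work
private:
   StackTraceDemo mStackTrace;   // constructed together with its owner
};

// The one global instance. Because other code refers to it, the linker
// keeps it, and the member's constructor comes along for the ride.
static GlobalsDemo gGlobals;

// Touches the global the way regular code would, then reports how often
// the stand-in constructor has run.
int constructorRunsSoFar()
{
   (void) gGlobals.bytesAllocated();
   return gConstructorRuns;
}
```

Called from main(), constructorRunsSoFar() should already report one run, because the global was constructed during static initialization.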

Resources:

C/C++ signal handling: lists all the standard signals.
CodeProject: StackWalker, a Windows stack walking class.