Fixing File Descriptor Leaks in Long-Lived Servers

So you’ve created a socket server that opens sockets over TCP/IP. After running for a few hours under heavy load it dies with the message “Too many open files”.

What happened?

The problem

On most POSIX operating systems, including Linux and OS X, each process is allowed a limited number of open file handles, or file descriptors. The default limit is typically 1024. When you try to open the 1025th file descriptor the call fails with the error EMFILE (“Too many open files”), and a server that does not handle that error usually crashes or stops accepting connections.

The problem is almost certainly that you are leaking file handles. That is, handles are being opened, and after you are done with them they are not closed.

Leaked file handles can come from many sources, not just open files. Some common sources are:

  • Sockets: Every socket also opens a file handle. Sockets need to be closed even if the remote party closes the connection. The handle is not automatically released if the connection closes.
  • Pipes: If you spawn sub-processes that communicate through pipes, the pipe ends often have to be duplicated (with dup() or dup2()) and the unused ends of the pipe need to be closed. If this isn’t done properly the server will leak file handles.
  • Database connections: When communicating with a database, even through a third party library, each connection will open some kind of file handle. SQL databases will typically open network sockets. File oriented databases like BerkeleyDB will open file handles, but the communication can also happen through named pipes. Each database API will have a call to close the connection.
  • Windows HANDLEs: On Windows there is a nearly endless variety of objects that hold handles. Each one must be closed properly.
  • Files: You can also leak handles the old fashioned way by failing to close() handles to regular files.
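
One way to make "every handle gets closed" harder to forget is to tie the descriptor to an object's lifetime. The following is only a sketch: the UniqueFd class is a hypothetical helper, not something from this article's code base, and it uses plain int rather than the s32 alias used in the later listings.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: owns one descriptor and closes it when the object
// goes out of scope, including on early returns and exceptions.
class UniqueFd
{
public:
   explicit UniqueFd( int fd = -1 ) : fd_( fd ) {}
   ~UniqueFd() { reset(); }

   // Copying would lead to a double close, so only moves are allowed.
   UniqueFd( const UniqueFd& ) = delete;
   UniqueFd& operator=( const UniqueFd& ) = delete;

   UniqueFd( UniqueFd&& other ) noexcept : fd_( other.release() ) {}
   UniqueFd& operator=( UniqueFd&& other ) noexcept
   {
      if ( this != &other ) reset( other.release() );
      return *this;
   }

   int  get() const   { return fd_; }
   bool valid() const { return fd_ != -1; }

   // Give up ownership without closing.
   int release() { int fd = fd_; fd_ = -1; return fd; }

   // Close the current descriptor (if any) and take ownership of a new one.
   void reset( int fd = -1 )
   {
      if ( fd_ != -1 ) ::close( fd_ );
      fd_ = fd;
   }

private:
   int fd_;
};
```

A socket returned by accept() or a file returned by open() can be wrapped immediately, for example UniqueFd conn( accept( listenFd, nullptr, nullptr ) ), so that every exit path from the request handler closes it.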

One very sneaky source of open file handles is when you fork a process, or execute a subprocess. By default the new process inherits a copy of every open file handle. Even after the new process replaces itself with a new program image, and all the data structures of the old process are wiped away, the old process’ file handles will remain, unless they were set with the FD_CLOEXEC flag. More on this later.

Wrong Answers, Myths and Bad Ideas

Raise the file handle limit

One common answer to this problem is to just raise the limit of open file handles and then restart the server every day or every few hours.

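On Linux, the system-wide limit on open file handles can be found with:

cat /proc/sys/fs/file-max

The per-process limit, which is the one your server is hitting, can be shown from a shell with:

ulimit -n

To raise the limit for that shell and the processes it starts:

ulimit -n 4096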

This will delay the problem but likely will not fix it. It is possible that your program is not leaking and has a legitimate need to hold a large number of file handles. But if your program is designed correctly there usually isn’t a need to keep a large number of handles open – even if you have thousands of simultaneous connections. We’ll discuss some methods of managing that later.

If this were a good idea the operating system would already come configured with a higher file descriptor limit. If this were necessary, Apache would require you to raise this limit before running.
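
For the cases where a higher limit really is justified, the same per-process limit can be read and changed from inside the server with getrlimit() and setrlimit(). This is a minimal sketch; the function names getFdSoftLimit and setFdSoftLimit are made up for illustration.

```cpp
#include <cassert>
#include <sys/resource.h>

// Reads the current soft limit on open descriptors into `out`.
bool getFdSoftLimit( rlim_t& out )
{
   struct rlimit rl;
   if ( getrlimit( RLIMIT_NOFILE, &rl ) == -1 ) return false;
   out = rl.rlim_cur;
   return true;
}

// Sets the soft limit to `wanted`, capped at the hard limit. An
// unprivileged process cannot raise the soft limit above the hard limit.
bool setFdSoftLimit( rlim_t wanted )
{
   struct rlimit rl;
   if ( getrlimit( RLIMIT_NOFILE, &rl ) == -1 ) return false;

   if ( rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max )
      wanted = rl.rlim_max;

   rl.rlim_cur = wanted;
   return setrlimit( RLIMIT_NOFILE, &rl ) == 0;
}
```

Logging the value from getFdSoftLimit() at startup also makes it obvious in the logs which limit the server was actually running under when it failed.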

Myth: Sockets in TCP TIME_WAIT are holding file handles hostage

When you close a TCP/IP socket the operating system does not release the socket right away. For complex reasons, the socket structure must be kept out of circulation for a few minutes because there is a small chance that an IP packet may arrive on that socket after it has been closed. If the operating system re-used the socket then the new user of that connection would have their session affected by someone else’s lost packets.

But this does not hold a file handle open. When you close the socket’s file descriptor, the file descriptor itself is closed. You will not get the “Too many open files” error. If you have too many sockets in TIME_WAIT, then your server may stop accepting new connections. There are ways to deal with that (allowing sockets to be re-used, or lowering TCP TIME_WAIT), but raising the file handle limit isn’t one of them.

Myth: It takes time for file handles to be released

This is related to the TCP TIME_WAIT myth: the mistaken belief that after you close a file handle you must wait some time for the operating system to release it.

Closing a file handle calls into whatever OS routine releases the underlying resource. The OS may release that resource immediately, or some time later as is the case with sockets, but close() frees the slot in the file handle table immediately. Your process is in complete control of its file handle table, and doesn’t need to wait for anything to free a slot in it.
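
This is easy to check for yourself. POSIX requires open() to return the lowest-numbered descriptor that is not currently open, so in a single-threaded program a slot freed by close() is handed straight back by the next open(). A small sketch (the function name is made up for illustration):

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Opens /dev/null, closes it, and opens it again. Returns true if the
// second open() was handed the same descriptor number as the first.
// Assumes no other thread is opening or closing descriptors meanwhile.
bool slotIsReusedImmediately()
{
   int first = open( "/dev/null", O_RDONLY );
   if ( first == -1 ) return false;
   close( first );

   int second = open( "/dev/null", O_RDONLY );
   if ( second == -1 ) return false;
   close( second );

   return first == second;
}
```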

Finding and Fixing File Handle Leaks

Finding all your open file descriptors is easy. When you print them out it will probably be clear what type of handles are not being closed.

File handles will be integers from 0 to the handle limit minus one. So if you can find the handle limit, you can just query for information about each one. getdtablesize() will return the size of the file descriptor table for the current process, that is, the maximum number of handles it can have open. Once you know that, you iterate over all the integers making any file handle OS call you want, like fcntl( i, F_GETFD ). If the call is successful, then the handle is currently open:

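#include <fcntl.h>
#include <unistd.h>
#include <cstdint>

typedef int32_t s32;        // assumed: the project's 32-bit integer alias

void showFDInfo( s32 fd );  // per-descriptor version, defined further down

// Print information about every open descriptor in this process.
void showFDInfo()
{
   s32 numHandles = getdtablesize();

   for ( s32 i = 0; i < numHandles; i++ )
   {
      // F_GETFD fails (with EBADF) when the descriptor is not open.
      s32 fd_flags = fcntl( i, F_GETFD );
      if ( fd_flags == -1 ) continue;

      showFDInfo( i );
   }
}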

For each valid file descriptor, you query it to get as much information about it as you can. Dumping its name and flags will help you identify its likely source:

#include <cstdio>
#include <iostream>
using std::cerr;

void showFDInfo( s32 fd )
{
   s32 fd_flags = fcntl( fd, F_GETFD );
   if ( fd_flags == -1 ) return;

   s32 fl_flags = fcntl( fd, F_GETFL );
   if ( fl_flags == -1 ) return;

   char path[256];
   snprintf( path, sizeof( path ), "/proc/self/fd/%d", fd );

   // readlink() does not null-terminate, so leave room for the terminator.
   char buf[256];
   ssize_t s = readlink( path, buf, sizeof( buf ) - 1 );
   if ( s == -1 )
   {
        cerr << fd << " (" << path << "): " << "not available\n";
        return;
   }
   buf[s] = '\0';
   cerr << fd << " (" << buf << "): ";

   if ( fd_flags & FD_CLOEXEC )  cerr << "cloexec ";

   // file status
   if ( fl_flags & O_APPEND   )  cerr << "append ";
   if ( fl_flags & O_NONBLOCK )  cerr << "nonblock ";

   // access mode: O_RDONLY is 0, so it has to be compared, not bit-tested
   switch ( fl_flags & O_ACCMODE )
   {
      case O_RDONLY: cerr << "read-only ";  break;
      case O_RDWR:   cerr << "read-write "; break;
      case O_WRONLY: cerr << "write-only "; break;
   }

   // O_SYNC and O_RSYNC can be multi-bit values, so compare the full mask
   if (   fl_flags & O_DSYNC               )  cerr << "dsync ";
   if ( ( fl_flags & O_RSYNC ) == O_RSYNC  )  cerr << "rsync ";
   if ( ( fl_flags & O_SYNC  ) == O_SYNC   )  cerr << "sync ";

   // Ask whether another process holds a lock anywhere in the file.
   struct flock fl;
   fl.l_type = F_WRLCK;
   fl.l_whence = SEEK_SET;
   fl.l_start = 0;
   fl.l_len = 0;
   if ( fcntl( fd, F_GETLK, &fl ) != -1 && fl.l_type != F_UNLCK )
   {
      if ( fl.l_type == F_WRLCK )
         cerr << "write-locked";
      else
         cerr << "read-locked";
      cerr << "(pid:" << fl.l_pid << ") ";
   }
   cerr << "\n";
}

On Linux, /proc/self/fd holds a symbolic link with a descriptive name for every open file handle. You can read that name with a readlink() call. The flags can also help you narrow down likely culprits.

The output looks something like this:

open file descriptors:
   
   0     (/dev/pts/2):                              read-write 
   1     (/dev/pts/2):                              read-write 
   2     (/dev/pts/2):                              read-write 
   3     (/blar/blar/blar/system.log):              cloexec append read-write 
   4     (/blar/blar/blar/access.log):              cloexec append read-write 
   5     (/blar/blar/blar/abuse.log):               cloexec append read-write 
   6     (anon_inode:[eventpoll]):                  cloexec read-write 
   7     (anon_inode:[eventfd]):                    cloexec nonblock read-write 
   8     (socket:[49863900]):                       cloexec nonblock read-write 
   9     (/blar/blar/blar/db/db_err):               cloexec append read-write 
   10    (/blar/blar/blar/db/blar.db):              cloexec read-write 
   11    (/blar/blar/blar/log.0000000001):          cloexec read-write 
   12    (socket:[50603354]):                       cloexec nonblock read-write 
   13    (socket:[50603364]):                       cloexec nonblock read-write 
   14    (socket:[50603365]):                       cloexec nonblock read-write 
   15    (socket:[50603366]):                       cloexec nonblock read-write 
   16    (socket:[50603367]):                       cloexec nonblock read-write 
   17    (socket:[50603368]):                       cloexec nonblock read-write 

This is from one of my own projects. Notice that all the file descriptors except the first three are set to close on execute.
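
On Linux there is also a shortcut to probing every integer up to the limit: the entries of /proc/self/fd are exactly the descriptors that are open. The sketch below lists them; listOpenFds is a hypothetical name, and the code assumes /proc is mounted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <dirent.h>
#include <vector>

// Linux-specific: returns the descriptors currently open in this process,
// in ascending order, by listing /proc/self/fd. Returns an empty vector
// if /proc is not available.
std::vector<int> listOpenFds()
{
   std::vector<int> fds;

   DIR* dir = opendir( "/proc/self/fd" );
   if ( dir == nullptr ) return fds;

   // The listing itself uses a descriptor; leave it out of the result.
   int dirFd = dirfd( dir );

   while ( struct dirent* entry = readdir( dir ) )
   {
      if ( entry->d_name[0] == '.' ) continue;   // skip "." and ".."
      int fd = std::atoi( entry->d_name );
      if ( fd != dirFd ) fds.push_back( fd );
   }
   closedir( dir );

   std::sort( fds.begin(), fds.end() );   // readdir() order is unspecified
   return fds;
}
```

Logging listOpenFds().size() once a minute is a cheap way to see whether the count climbs steadily, which is the signature of a leak.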

If you still can’t figure it out, you could log every call that creates file descriptors (e.g. fopen(), socket(), accept(), dup(), pipe()) along with __FILE__ and __LINE__.
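
A rough sketch of what such logging could look like follows. Everything in it (the TRACK_FD macro, the g_fdOrigins registry, and the helper functions) is hypothetical and not thread-safe as written; a real server would protect the registry with a mutex.

```cpp
#include <cassert>
#include <fcntl.h>
#include <iostream>
#include <map>
#include <string>
#include <unistd.h>

// Hypothetical registry: descriptor -> "file:line" of the call that made it.
static std::map<int, std::string> g_fdOrigins;

// Remember where a descriptor was created. Failed calls (-1) are ignored.
inline int trackFd( int fd, const char* file, int line )
{
   if ( fd != -1 )
      g_fdOrigins[fd] = std::string( file ) + ":" + std::to_string( line );
   return fd;
}

// Forget the descriptor and close it.
inline int untrackAndClose( int fd )
{
   g_fdOrigins.erase( fd );
   return close( fd );
}

// Wrap any descriptor-creating call: TRACK_FD( open( path, O_RDONLY ) )
#define TRACK_FD( call ) trackFd( ( call ), __FILE__, __LINE__ )

// Print every descriptor that was created but never closed.
inline void dumpTrackedFds()
{
   for ( const auto& entry : g_fdOrigins )
      std::cerr << entry.first << " opened at " << entry.second << "\n";
}
```

Calling dumpTrackedFds() when the server starts getting “Too many open files” shows which call sites the surviving descriptors came from.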

Limiting Open File Handles with Thousands of Sockets

If you have thousands of simultaneous network requests you can limit the number of open file descriptors by not calling accept() on every connection at once.

If you really need to keep thousands of TCP connections open simultaneously then you really do need to raise the file descriptor limit. But if your server is request based, like a web server, then don’t accept() them all at once. Pending connections wait in the listen queue, holding a socket open in the kernel but not consuming a file handle in your process, until you are ready to handle them.

If you have this problem then you should be using one of the many event handling APIs (like select() or epoll). These APIs will notify you when a socket is ready, and you accept connections only as you have system resources available.

Also if you go this route, it is better to use a library like libev or libevent. Each operating system provides different event handling APIs. These libraries make it easier to use the best API for each system.
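
The core of the idea can be sketched without any event library: keep a count of the connections you hold and stop calling accept() once a cap is reached. The code below is an illustration under assumptions, not a complete server. The function names and the cap parameter are made up, the listener is bound to the loopback address only, and in practice acceptWhileRoom() would be called whenever select() or epoll reports the listening socket as readable.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Creates a non-blocking TCP listener on 127.0.0.1 (port 0 = any free port).
int makeLoopbackListener( uint16_t port, int backlog )
{
   int fd = socket( AF_INET, SOCK_STREAM, 0 );
   if ( fd == -1 ) return -1;

   sockaddr_in addr{};
   addr.sin_family      = AF_INET;
   addr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );
   addr.sin_port        = htons( port );

   int flags = fcntl( fd, F_GETFL );
   if ( flags == -1
     || fcntl( fd, F_SETFL, flags | O_NONBLOCK ) == -1
     || bind( fd, reinterpret_cast<sockaddr*>( &addr ), sizeof addr ) == -1
     || listen( fd, backlog ) == -1 )
   {
      close( fd );
      return -1;
   }
   return fd;
}

// Accepts pending connections until the listen queue is empty or `maxOpen`
// connections are held. New descriptors are appended to `conns`.
// Returns how many connections were accepted by this call.
int acceptWhileRoom( int listenFd, std::vector<int>& conns, std::size_t maxOpen )
{
   int accepted = 0;
   while ( conns.size() < maxOpen )
   {
      int fd = accept( listenFd, nullptr, nullptr );
      if ( fd == -1 )
      {
         if ( errno == EINTR ) continue;
         break;   // EAGAIN/EWOULDBLOCK: queue is empty; other errors: retry later
      }
      conns.push_back( fd );
      ++accepted;
   }
   return accepted;
}
```

When a request finishes, the handler closes its descriptor, removes it from conns, and the next call to acceptWhileRoom() picks up the next waiting client.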

Dealing with Duplicate File Descriptors from Sub-Processes

When you spawn subprocesses they inherit all the open file descriptors of the parent process unless those descriptors have been specifically flagged as FD_CLOEXEC.

This is because the operating system doesn’t know which file descriptors will be used by the subprocess for inter-process communication, or whether the subprocess will be the one that handles an open network socket.

Normally when you spawn subprocesses the only file descriptors that need to stay open are the pipes that are connected to STDIN, STDOUT, and STDERR of the child process. But every subprocess is different.

On POSIX systems, you flag a file descriptor so that it is closed on exec, rather than inherited, like this:

bool setCloExec( s32 fd, bool val )
{
   s32 flags = fcntl( fd, F_GETFD );
   if ( flags == -1 ) return false;

   if ( val )
   {
      if ( fcntl( fd, F_SETFD, flags |  FD_CLOEXEC ) == -1 ) return false;
   } else {
      if ( fcntl( fd, F_SETFD, flags & ~FD_CLOEXEC ) == -1 ) return false;
   }

   return true;
}

On Win32 you do it like this:

bool setCloExec( u32 handle, bool val )
{
   // Close-on-exec is the inverse of Win32 handle inheritance:
   // clear HANDLE_FLAG_INHERIT to keep the handle out of child processes.
   DWORD flags = val ? 0 : HANDLE_FLAG_INHERIT;
   if ( SetHandleInformation( (HANDLE)handle, HANDLE_FLAG_INHERIT, flags )) return true;

   assert(0); // failed to set inheritance flag.
   return false;
}

Right before your server calls fork() or exec(), call showFDInfo() to make sure that all file descriptors are flagged FD_CLOEXEC except for the ones you need the child to inherit.
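
On systems that implement POSIX.1-2008 the flag can also be set at the moment the descriptor is created, by passing O_CLOEXEC to open(). That closes the small window between open() and fcntl() in which another thread could fork and exec. Linux offers the same for sockets and pipes through SOCK_CLOEXEC, accept4() and pipe2(). A minimal sketch, with made-up function names:

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Opens a file read-only with close-on-exec already set.
int openCloExec( const char* path )
{
   return open( path, O_RDONLY | O_CLOEXEC );
}

// Returns true if the descriptor is open and has FD_CLOEXEC set.
bool hasCloExec( int fd )
{
   int flags = fcntl( fd, F_GETFD );
   return flags != -1 && ( flags & FD_CLOEXEC ) != 0;
}
```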

Dealing with Multiple Database Connections

Don’t open a new database connection for every database request. This is wasteful and unnecessary. Each database connection may open a new TCP/IP socket, which doubles the number of sockets and file descriptors each request holds while it is running, and doubles the number of sockets left in TIME_WAIT after the connections are closed.

Open just one, or a limited number of database connections, and pass the open database connection to each request handling thread, protected by a mutex of course.
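
A limited set of shared connections is usually organized as a pool. The sketch below is a generic illustration: ConnectionPool is a hypothetical class, and Conn stands in for whatever handle type your database API returns. The pool only hands out connections that were opened elsewhere; it never opens or closes them itself.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

template <typename Conn>
class ConnectionPool
{
public:
   // Takes ownership of a fixed set of already-open connections.
   explicit ConnectionPool( std::vector<Conn> conns )
      : idle_( std::move( conns ) ) {}

   // Blocks until a connection is free, then hands it to the caller.
   Conn acquire()
   {
      std::unique_lock<std::mutex> lock( mutex_ );
      available_.wait( lock, [this] { return !idle_.empty(); } );
      Conn c = std::move( idle_.back() );
      idle_.pop_back();
      return c;
   }

   // Returns a connection to the pool and wakes one waiting thread.
   void release( Conn c )
   {
      {
         std::lock_guard<std::mutex> lock( mutex_ );
         idle_.push_back( std::move( c ) );
      }
      available_.notify_one();
   }

   // Number of connections currently not in use.
   std::size_t idleCount()
   {
      std::lock_guard<std::mutex> lock( mutex_ );
      return idle_.size();
   }

private:
   std::mutex              mutex_;
   std::condition_variable available_;
   std::vector<Conn>       idle_;
};
```

Because the pool size is fixed, the number of file descriptors spent on database connections is fixed too, no matter how many requests are in flight.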

Closing File Descriptors when Spawning Subprocesses

To communicate with subprocesses very often you open pipes that connect the two processes, typically by passing the three pipes connected to STDIN, STDOUT and STDERR of the child process.

You create the pipe before the process is forked or the new process is spawned. After fork() both processes hold a copy of each end, so the pipe effectively has two read ends and two write ends. Each process must close the ends it does not use. For a pipe that writes from the parent to the child, the parent closes its copy of the read end and the child closes its copy of the write end. If the child duplicates its read end onto STDIN (with dup2()), it also closes the original read descriptor afterwards. At the end of the spawn the parent is left with one write file descriptor and the child with one read file descriptor. When the child terminates it doesn’t need to close the read end, because the operating system closes it on exit. But the parent must still close the write end.
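
The POSIX version of that sequence can be sketched as follows. This is a toy under stated assumptions: sendToChild is a made-up function, the child does not exec() anything but simply counts the bytes arriving on its STDIN, and it reports the count through its exit status, so it only works for short messages (fewer than 255 bytes).

```cpp
#include <cassert>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sends `msg` to a forked child over a pipe wired to the child's STDIN.
// Returns the number of bytes the child reported reading, or -1 on error.
int sendToChild( const char* msg )
{
   int fds[2];                      // fds[0] = read end, fds[1] = write end
   if ( pipe( fds ) == -1 ) return -1;

   pid_t pid = fork();
   if ( pid == -1 ) { close( fds[0] ); close( fds[1] ); return -1; }

   if ( pid == 0 )
   {
      // Child: close the write end, move the read end onto STDIN, then
      // close the original read descriptor so only STDIN refers to the pipe.
      close( fds[1] );
      if ( fds[0] != STDIN_FILENO )
      {
         if ( dup2( fds[0], STDIN_FILENO ) == -1 ) _exit( 255 );
         close( fds[0] );
      }

      // A real server would exec() the subprocess here.
      char buf[64];
      int total = 0;
      ssize_t n;
      while ( ( n = read( STDIN_FILENO, buf, sizeof buf ) ) > 0 ) total += n;
      _exit( total );
   }

   // Parent: close the read end, write, then close the write end so the
   // child sees end-of-file. Skipping that last close() leaks a descriptor
   // and leaves the child blocked in read() forever.
   close( fds[0] );
   ssize_t written = write( fds[1], msg, std::strlen( msg ) );
   close( fds[1] );

   int status = 0;
   if ( waitpid( pid, &status, 0 ) == -1 || written == -1 ) return -1;
   return WIFEXITED( status ) ? WEXITSTATUS( status ) : -1;
}
```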

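The best explanation of how to use pipes for inter-process communication is the “Unix Pipes” article listed under Resources.
You can find more details about how to do this on Windows in “Creating a Child Process with Redirected Input and Output”, and how to do this on Linux and OS X in “Pipe, Fork, Exec and Related Topics”.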

Resources:

Unix Pipes: A really good explanation of pipes and IPC
Creating a Child Process with Redirected Input and Output: How to spawn processes and deal with IPC in Windows
Pipe, Fork, Exec and Related Topics: A tutorial on how to spawn processes on POSIX
fork() and closing file descriptors: An article on using FD_CLOEXEC to close file descriptor leaks
Socket accept – “Too many open files”: A Stack Overflow thread about this problem
What is the cost of many TIME_WAIT on the server side?: A Stack Overflow thread about TCP TIME_WAIT and file descriptors