
11 Oct 2006

by Noel

Debugging File Handle Exhaustion

Dave has been working like a maniac to switch our database code over from SQLite to PostgreSQL. PostgreSQL has two main advantages for us: it is much faster, and we can open up ODBC connections to the database for other uses that don’t require a web interface. The change is now complete, though it hasn’t been without some difficulties. One problem that bit us was running out of file handles. If you ever have a similar problem, here is how to debug it.

On Linux the /proc filesystem exposes a great many kernel resources. The entries of particular interest for our purposes are:

  • The files file-nr and file-max in /proc/sys/fs.
  • The per-process directories, keyed by process ID.

The first thing to check is the value of /proc/sys/fs/file-max, which is the maximum number of file handles allowed on your system. This shouldn’t be a problem, but just ensure it isn’t something ridiculously small. On our system we get:

$ cat /proc/sys/fs/file-max
89367

That should be plenty under any reasonable usage, but we can check how many file handles are open by reading the value of /proc/sys/fs/file-nr. On our system this is:

$ cat /proc/sys/fs/file-nr
920 0 89367
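If you want to watch these numbers from a script, the fields can be read into variables. This is only a minimal sketch: it assumes the usual three-field layout of file-nr (allocated handles, allocated but unused handles, system-wide maximum), and the function name file_handle_usage is made up for illustration.

```shell
# Hypothetical helper: summarise the system-wide file handle counters.
# Takes an optional path so it can also be pointed at a saved copy of file-nr.
file_handle_usage() (
    path="${1:-/proc/sys/fs/file-nr}"
    # file-nr holds three whitespace-separated numbers on one line.
    read -r allocated unused max < "$path" || exit 1
    echo "$allocated of $max handles allocated ($unused allocated but unused)"
)

# Example: file_handle_usage
```

Called with no argument it reads the live counters; a number creeping towards the maximum would point at a system-wide problem rather than a per-process one.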

The first number is the number of file handles in use. At 920 out of 89367 there is definitely no problem there, so the system-wide limit isn’t what we are hitting. It must be that a single process is exceeding the per-process limit on file handles. In our setup this could be either PostgreSQL or MzScheme. We need the process IDs to find out how many handles each is using.

$ ps -A | grep postmaster
12936 ? 00:00:00 postmaster
12937 ? 00:00:00 postmaster
12939 ? 00:00:00 postmaster
12940 ? 00:00:00 postmaster
12941 ? 00:00:00 postmaster
$ ps -A | grep mzscheme
20382 ? 00:00:26 mzscheme
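It also helps to know what the per-process limit actually is. The sketch below makes two assumptions: that the shell's ulimit -n reflects the limit the suspect process was started with, and that the kernel provides a /proc/<pid>/limits file (newer kernels do, but it may not exist on an older system). The function name open_file_limit and the PROC_ROOT variable are hypothetical, the latter only there so the function can be tried against a copy of the file.

```shell
# Soft limit on open files for processes started from this shell.
ulimit -n

# Hypothetical helper: print the soft "Max open files" limit of one process.
# Assumes /proc/<pid>/limits exists; PROC_ROOT is an illustrative override.
open_file_limit() (
    limits="${PROC_ROOT:-/proc}/$1/limits"
    # The line looks like: "Max open files   1024   4096   files"
    # so the fourth field is the soft limit and the fifth the hard limit.
    awk '/^Max open files/ { print $4 }' "$limits"
)

# Example: open_file_limit 20382
```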

We can see how many handles are in use by looking in the directory for each process ID. For example, for the first PostgreSQL process:

$ sudo ls -l /proc/12936/fd/ | wc -l
4

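So that PostgreSQL process is using only a few handles (ls -l also prints a “total” line, so the true count is one less than the 4 shown). The other PostgreSQL processes are using similar numbers. So it must be our MzScheme process that is using up all the handles. We check that in a similar way, this time counting only sockets, and the result is: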

$ sudo ls -l /proc/20382/fd/ | grep socket | wc -l
193

Looks like we’ve found our culprit.
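The per-process checks above can be rolled into one loop. This is a rough sketch rather than a polished tool: the function name fd_summary and the PROC_ROOT variable are made up for illustration, and it needs to run as root (or as the owner of the processes) to read their fd directories. Counting the symlinks directly also avoids the extra “total” line that ls -l adds.

```shell
# Hypothetical helper: print "<pid> <total fds> <socket fds>" for each PID given.
# PROC_ROOT is an illustrative override; it defaults to the real /proc.
fd_summary() (
    proc="${PROC_ROOT:-/proc}"
    for pid in "$@"; do
        total=0
        sockets=0
        for fd in "$proc/$pid/fd"/*; do
            # Every open handle appears as a symlink; skip an unmatched glob.
            [ -L "$fd" ] || continue
            total=$((total + 1))
            # Sockets show up as links of the form "socket:[inode]".
            case "$(readlink "$fd")" in
                socket:*) sockets=$((sockets + 1)) ;;
            esac
        done
        echo "$pid $total $sockets"
    done
)

# Example (as root): fd_summary 12936 12937 20382
```

A process whose total is close to its per-process limit, with most of that total being sockets, is a strong hint that connections are being opened and never closed.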
