SAS & JBoss: Too Many Open Files

I’ve been seeing some ‘Too many open files’ exceptions in the SAS® mid-tier JBoss logs on my Ubuntu Linux server. I was surprised by this because I remember following the guidance in the SAS documentation during installation and increasing the nofile limit. It turns out that verifying the ulimit from a normal console/ssh login was not sufficient; I should have verified it from an su-based login too.

These were the sorts of messages I was seeing in the JBoss logs:

2012-03-11 09:22:38,463 ERROR [org.apache.tomcat.util.net.JIoEndpoint] Socket accept failed
java.net.SocketException: Too many open files
...
2012-03-11 08:57:19,454 ERROR [org.apache.catalina.core.StandardContext] Error reading tld listeners java.io.FileNotFoundException: /opt/jboss-4.2.3.GA/server/SASServer3/work/jboss.web/localhost/SASBIDashboard/tldCache.ser (Too many open files)
...

There are some Pre-Installation Steps for JBoss for both SAS 9.3 and SAS 9.2 that should be followed during installation to avoid these errors. These were the instructions I had followed but, as you’ll see in a moment, they weren’t quite enough for this (unsupported) Ubuntu installation.

The instructions say to edit the /etc/security/limits.conf file directly. Rather than editing this main config file, where the settings might get forgotten or lost during an upgrade, I placed the settings required for my SAS installation in their own dedicated config file: /etc/security/limits.d/sas.conf

# Increase the open file descriptors limit from the default of 1024 to 30720 for JBoss running web apps for SAS 9.2/9.3
* - nofile 30720
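
If you prefer to create that drop-in file from the command line, something like the following (run as root) would do it. The file name sas.conf is simply my choice; as far as I know, pam_limits reads any *.conf file under /etc/security/limits.d/:

root@server:~# echo '* - nofile 30720' > /etc/security/limits.d/sas.conf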

I knew this had taken effect because I had logged in via ssh as the sas user (I run JBoss as the sas user) to verify it. Checking the ulimit, I saw the following:

sas@server:~$ ulimit -Hn;ulimit -Sn
30720
30720

With nofile at 30720, how was it that I was still getting ‘Too many open files’ errors? After a quick session on Google I found a blog post suggesting that the increased limits will only apply if the pam_limits PAM module is enabled.

Checking the /etc/pam.d/login file I could see the pam_limits line was already present and uncommented:
...
session required pam_limits.so
...

This made sense since the console/ssh login showed the expected numbers.
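
Rather than opening each PAM service file in turn, a quick grep shows every file that references pam_limits (just a convenience check; the exact set of files will vary by distribution). A line starting with # means the module is commented out, and therefore disabled, for that service:

root@server:~# grep -n pam_limits /etc/pam.d/*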

Google also led me to a Stack Overflow question (how do i set hard and soft file limits for a non-root user at boot?). The answer provided there indicated that, for su commands, you also need to verify the pam_limits module is enabled in an additional su-specific PAM config file, which on my machine was /etc/pam.d/su. My JBoss init script runs as root during system startup but uses su to run JBoss as the sas user. Looking in /etc/pam.d/su I could see that the pam_limits line was commented out, so perhaps that was the issue.
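
For reference, the line in question in /etc/pam.d/su looked something like this while it was still commented out (the exact spacing may differ on other systems):

# session    required   pam_limits.so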

Before making the necessary changes, I verified the nofile ulimit for the sas user by running su as root:

root@server:~# su sas --login --command 'ulimit -Hn;ulimit -Sn'
1024
1024

Aha! It had the 1024 default rather than the increased value. It looked like this was indeed the problem. I uncommented the pam_limits line in /etc/pam.d/su and repeated the test:

root@server:~# su sas --login --command 'ulimit -Hn;ulimit -Sn'
30720
30720

It now shows the increased value as expected, so it looks like the problem’s fixed. I restarted JBoss and haven’t seen any ‘Too many open files’ errors since.
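
As a final sanity check, not part of the original troubleshooting but a technique worth noting, you can confirm the limit the running JBoss JVM actually ended up with by looking at /proc. The pgrep pattern below is illustrative and assumes the server’s java command line contains SASServer3, as the log paths above suggest; after the fix it should show the increased values:

# Illustrative check of the effective limit for the running SASServer3 process
root@server:~# grep 'Max open files' /proc/$(pgrep -f SASServer3 | head -1)/limits
Max open files            30720                30720                files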