Affects Version/s: 6.1.0 CE GA1, 6.1.10 EE GA1
Environment: Oracle Enterprise Linux 6.2
LPS-26698 made changes to ensure that NamingEnumeration instances are closed correctly. When NamingEnumeration instances are not closed correctly, threads in the LDAP thread pool are not recovered. If the LDAP server is not functioning correctly, this can eventually lead to server failure during an LDAP import or under repeated authentication, because the system becomes unable to allocate new native threads. LPS-26698 missed closing two NamingEnumeration instances in com.liferay.portal.security.ldap.PortalLDAPUtil.java (SVN revision 107571). These need to be corrected as shown in the attached patch file (for SVN revision 116403).
For details on why this is causing a problem refer to http://blogs.warwick.ac.uk/kieranshaw/entry/ldap_connection_pooling/.
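The kind of fix described above can be sketched as follows. This is a minimal illustration only, not the contents of the attached patch: the class name NamingEnumerationCloseSketch and the method readNames are hypothetical and do not appear in PortalLDAPUtil. The point is that close() is called in a finally block, so the enumeration is released even when the LDAP read fails part-way through (for example, on a read timeout).

```java
import java.util.ArrayList;
import java.util.List;

import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.SearchResult;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class NamingEnumerationCloseSketch {

    // Drains the enumeration and always closes it, even if hasMore() or
    // next() throws a NamingException while reading from the LDAP server.
    public static List<String> readNames(
            NamingEnumeration<SearchResult> results) throws NamingException {

        List<String> names = new ArrayList<String>();

        if (results == null) {
            return names;
        }

        try {
            while (results.hasMore()) {
                SearchResult result = results.next();
                names.add(result.getName());
            }
        } finally {
            // Without this call the underlying LDAP resources may not be
            // released back to the pool.
            results.close();
        }

        return names;
    }
}
```

A caller would pass in the enumeration returned by a DirContext search and would no longer need to close it itself.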
To attempt to reproduce the fault, follow the steps below. This is how the problem was originally detected in our test environment, using a file version prior to 107571, as part of stress testing OpenSSO logins with open.sso.ldap.import.enabled=true.
1. Include the following property in the portal-ext.properties file (The default value is 15 seconds):
2. Restart the portal and run a load test script that simply logs a user in and out repeatedly. Alternatively, configure the portal to do an LDAP import on start-up. Any action that produces a large number of requests for data retrieved from LDAP will do.
3. Check the log files for any LDAP read-timeout errors. If necessary, lower the value of the property and restart. In our case, the LDAP server was returning results slightly slowly at one point, which produced the timeout errors.
4. Once you notice a number of errors in the logs, allow the process to continue running so that it generates even more errors.
The failure to close the NamingEnumeration results in connections to the LDAP server not being released correctly. Because the LDAP pool is also not correctly configured (a separate issue will be created for this), the number of LDAP connections grows unchecked. The growth in the number of connections can be observed with:
netstat -tcp -p | grep name_of_your_ldap_server
You will also find that the number of JVM threads blocked on LDAP read requests increases (see thread dump). Eventually the server will fail when no further native threads can be allocated.
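As a rough aid for reading the thread dump, the count of threads sitting in LDAP code can also be obtained programmatically. The sketch below is an assumption-laden illustration, not part of the portal: the class LdapThreadCountSketch and its method are hypothetical, and the package prefix com.sun.jndi.ldap. assumes the JDK's built-in JNDI LDAP provider is in use. It counts the threads whose stack contains at least one frame from a given package.

```java
import java.util.Map;

// Hypothetical diagnostic sketch; not part of Liferay.
public class LdapThreadCountSketch {

    // Counts threads that have at least one stack frame whose class name
    // starts with the given prefix. Each thread is counted at most once.
    public static int countThreadsInPackage(
            Map<Thread, StackTraceElement[]> traces, String classPrefix) {

        int count = 0;

        for (StackTraceElement[] stack : traces.values()) {
            for (StackTraceElement frame : stack) {
                if (frame.getClassName().startsWith(classPrefix)) {
                    count++;
                    break;
                }
            }
        }

        return count;
    }
}
```

To be meaningful it has to run inside the portal's JVM, for example as countThreadsInPackage(Thread.getAllStackTraces(), "com.sun.jndi.ldap."). Sampling the value a few times during the load test should show whether it keeps climbing.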