Affects Version/s: None
Fix Version/s: 5.2.0
We've been experiencing relatively poor LDAP performance under load, running Liferay 5.1.1 (and 5.1.2). We first experienced this under heavy load of a 5.1.1 instance hooked up to a high latency remote AD. We saw hundreds of open LDAP connections and very slow login times. The core issue appeared to be how LDAPAuth and LDAPPortalUtil managed LDAP connections:
1.) LDAP connections do not appear to be pooled in either LDAPAuth or LDAPPortalUtil. It doesn't make sense to pool the connection used for an authentication bind in LDAPAuth, but those used in LDAPPortalUtil can be pooled.
2.) LDAPAuth (5.1.1 tag) will orphan Context objects without closing them when it performs the internal LDAP authentication (see LDAPAuth.java line 280 in the 5.1.1 tag, and trace back)
3.) Neither LDAPAuth nor LDAPPortalUtil closes any naming enumerations; in our environment this leaves LDAP connections open (http://java.sun.com/docs/books/tutorial/jndi/ldap/close.html)
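To illustrate points 2 and 3, here is a minimal sketch of the cleanup pattern we mean. It is not the actual patch: the class LdapCleanup and the methods closeEnumeration, closeContext and countMatches are hypothetical names made up for this example; only the javax.naming calls are real JNDI API.

```java
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

// Hypothetical helper class, for illustration only (not part of Liferay).
public class LdapCleanup {

    // Closes a NamingEnumeration; a failure during close is swallowed
    // so that the remaining cleanup steps still run.
    public static void closeEnumeration(NamingEnumeration<?> results) {
        if (results == null) {
            return;
        }
        try {
            results.close();
        } catch (NamingException ne) {
            // Nothing useful to do here; real code would log this.
        }
    }

    // Closes a Context with the same null-safe, exception-safe behaviour.
    public static void closeContext(Context ctx) {
        if (ctx == null) {
            return;
        }
        try {
            ctx.close();
        } catch (NamingException ne) {
            // Nothing useful to do here; real code would log this.
        }
    }

    // Example search that releases the enumeration even when the loop
    // exits early because of an exception.
    public static int countMatches(DirContext ctx, String baseDN, String filter)
            throws NamingException {
        NamingEnumeration<SearchResult> results = null;
        try {
            results = ctx.search(baseDN, filter, new SearchControls());
            int count = 0;
            while (results.hasMore()) {
                results.next();
                count++;
            }
            return count;
        } finally {
            closeEnumeration(results);
        }
    }
}
```

The caller that created the DirContext would wrap its own work in the same kind of try/finally and call closeContext there, so that no Context is orphaned on an error path.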
Because we are stuck with 5.1.1 for the time being, we've backported LDAPAuth + LDAPPortalUtil from trunk and added LDAP connection pooling for all LDAP connections except those used in the LDAPAuth#authenticate(ctx, companyId, attrs, fullUserDN, password) method. This involved an extra key/value pair in the context environment table and some extra values in system-ext.properties (com.sun.jndi.ldap.connect.pool.maxsize, com.sun.jndi.ldap.connect.pool.timeout, com.sun.jndi.ldap.connect.pool.prefsize). We've also closed all NamingEnumerations and moved the Context#close invocations in various places within LDAPPortalUtil and LDAPAuth into try/catch/finally blocks to guarantee safe closure. Finally, we pulled in PropsKeys and PropsValues from trunk (hopefully this didn't grenade anything outside of the whole login thing).
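For reference, the "extra key/value pair" is the Sun JNDI provider's com.sun.jndi.ldap.connect.pool environment property set to "true". A rough sketch of how the environment table can be built with and without it follows; the class LdapPoolingEnv and the method buildEnvironment are hypothetical names for this example and are not what the patch uses.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper class, for illustration only (not part of Liferay).
public class LdapPoolingEnv {

    // Per-context switch understood by the Sun LDAP provider.
    public static final String POOL_KEY = "com.sun.jndi.ldap.connect.pool";

    // Builds a JNDI environment. Pass pooled=false for the context used
    // in the authentication bind and pooled=true for everything else.
    public static Hashtable<String, String> buildEnvironment(
            String providerUrl, String principal, String credentials,
            boolean pooled) {

        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, providerUrl);
        env.put(Context.SECURITY_PRINCIPAL, principal);
        env.put(Context.SECURITY_CREDENTIALS, credentials);
        if (pooled) {
            env.put(POOL_KEY, "true");
        }
        return env;
    }
}
```

The pool sizing values (maxsize, prefsize, timeout) are read by the provider as JVM system properties rather than from this table, which is why they go into system-ext.properties. With pooling on, Context#close returns the connection to the pool instead of tearing it down, so the close calls are still required.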
Our patch is against 5.1.1, but the important bits (NamingEnumeration#close, LDAP connection pooling, Context#close, "internalContext" in LDAPAuth) should be easy to find and apply to 5.1.2 and trunk. The LDAP pooling used in the patch is Sun JVM specific, but it's easy to get rid of if you don't need it. Ideally we would add a property to portal.properties that toggles whether or not the pooling is enabled. I don't want to toggle pooling globally by overriding the ldap.import.factory.initial property with a proprietary class because we presumably want to have pooling only in some contexts. Furthermore, this whole pooling thing makes TLS and failover quite a bit harder =)
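A possible shape for that toggle, purely as a sketch: the property name ldap.connection.pool.enabled, the class LdapPoolingToggle and the method usePooling are all invented for this example and do not exist in portal.properties or in the patch.

```java
import java.util.Properties;

// Hypothetical helper class, for illustration only (not part of Liferay).
public class LdapPoolingToggle {

    // Invented property name; a real one would be defined in portal.properties.
    public static final String TOGGLE_KEY = "ldap.connection.pool.enabled";

    // Pooling is never used for the authentication bind; for every other
    // context it follows the property and defaults to off when unset.
    public static boolean usePooling(Properties props, boolean authenticationBind) {
        if (authenticationBind) {
            return false;
        }
        return Boolean.parseBoolean(props.getProperty(TOGGLE_KEY, "false"));
    }
}
```

Deciding per context like this keeps the bind in LDAPAuth#authenticate unpooled while letting the import and lookup contexts opt in, which a global factory override could not do.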
We are not using the batch import feature, but are importing all user attributes and properties on login, so we've only really tested LDAPPortalUtil#importLDAPUser. We didn't do any in-depth testing of LDAPPortalUtil#importFromLDAP, but it seemed to work as expected as well.
In our specific configuration (high latency remote LDAP) we saw an order of magnitude increase in performance under load and saw the number of open LDAP connections go from hundreds to 5-20 depending on pool size. I'm not sure whether to attribute the performance bump to better resource cleanup or to pooling, but I'm sure both fixes help in some regard.