PUBLIC - Liferay Portal Community Edition / LPS-97074

Improve PerFieldAnalyzer performance with caching

    Details

      Description

      7.x versions are not affected; this code is no longer in place there.


      Thread dumps revealed many CPU-heavy threads. The cause is that PerFieldAnalyzer does not cache the analyzers it has already resolved, so the pattern matching for DDM and localized fields is repeated on every call.

      An example of a CPU-heavy thread:

         java.lang.Thread.State: RUNNABLE
      	at java.lang.Character.codePointAt(Character.java:4668)
      	at java.util.regex.Pattern$CharProperty.match(Pattern.java:3693)
      	at java.util.regex.Pattern$Curly.match0(Pattern.java:4158)
      	at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
      	at java.util.regex.Matcher.match(Matcher.java:1221)
      	at java.util.regex.Matcher.matches(Matcher.java:559)
      	at java.util.regex.Pattern.matches(Pattern.java:1130)
      	at com.liferay.portal.search.lucene.PerFieldAnalyzer.getAnalyzer(PerFieldAnalyzer.java:63)
      	at com.liferay.portal.search.lucene.LuceneHelperImpl.isLikeField(LuceneHelperImpl.java:620)
      	at com.liferay.portal.search.lucene.LuceneHelperImpl.addTerm(LuceneHelperImpl.java:235)
      	at com.liferay.portal.search.lucene.LuceneHelperImpl.addTerm(LuceneHelperImpl.java:220)
      	at com.liferay.portal.search.lucene.LuceneHelperUtil.addTerm(LuceneHelperUtil.java:268)
      	at com.liferay.portal.search.lucene.LuceneHelperUtil.addTerm(LuceneHelperUtil.java:262)
      	at com.liferay.portal.search.lucene.LuceneHelperUtil.addTerm(LuceneHelperUtil.java:256)
      	at com.liferay.portal.search.lucene.BooleanQueryImpl.addTerm(BooleanQueryImpl.java:265)
      	at com.liferay.portal.kernel.search.BaseIndexer.addSearchClassTypeIds(BaseIndexer.java:990)
      	at com.liferay.portlet.journal.util.JournalArticleIndexer.postProcessContextQuery(JournalArticleIndexer.java:163)
      

      The problem is that getAnalyzer returns the analyzer after a successful match (Pattern.matches(key, fieldName)) but does not cache it. Because Pattern.matches compiles its regular expression on every call, the whole pattern matching, including the costly regexp compilation, is repeated for every lookup of the same field name.

      PerFieldAnalyzer.java
      	public Analyzer getAnalyzer(String fieldName) {
      		Analyzer analyzer = _analyzers.get(fieldName);
      
      		if (analyzer != null) {
      			return analyzer;
      		}
      
      		for (String key : _analyzers.keySet()) {
      			if (Pattern.matches(key, fieldName)) {
      				return _analyzers.get(key);
      			}
      		}
      
      		return _analyzer;
      	}
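
One possible shape for the fix, as a minimal sketch rather than the actual patch: compile each key once and remember the analyzer resolved for each field name. The class name CachingPerFieldLookup, the type parameter A (standing in for Lucene's Analyzer so the example compiles on its own), and the _patterns and _cache fields are hypothetical. The sketch assumes the analyzer map does not change after construction and that the default analyzer is non-null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Hypothetical stand-in for PerFieldAnalyzer. A plays the role of Analyzer.
final class CachingPerFieldLookup<A> {

	private final A _analyzer;
	private final Map<String, A> _analyzers;

	// Each key is compiled exactly once, instead of on every lookup.
	// Pattern uses identity equality, which is fine for iteration here.
	private final Map<Pattern, A> _patterns = new LinkedHashMap<>();

	// Field name -> resolved analyzer, including fallbacks to the default.
	private final Map<String, A> _cache = new ConcurrentHashMap<>();

	CachingPerFieldLookup(A defaultAnalyzer, Map<String, A> analyzers) {
		// ConcurrentHashMap rejects null values, so the default must be set.
		_analyzer = Objects.requireNonNull(defaultAnalyzer);
		_analyzers = new LinkedHashMap<>(analyzers);

		for (Map.Entry<String, A> entry : _analyzers.entrySet()) {
			_patterns.put(Pattern.compile(entry.getKey()), entry.getValue());
		}
	}

	A getAnalyzer(String fieldName) {
		A analyzer = _cache.get(fieldName);

		if (analyzer != null) {
			return analyzer;
		}

		analyzer = _resolve(fieldName);

		// Concurrent callers may resolve the same name twice; both arrive
		// at the same analyzer, so the duplicate put is harmless.
		_cache.put(fieldName, analyzer);

		return analyzer;
	}

	// Same lookup order as the original: exact key, then patterns, then default.
	private A _resolve(String fieldName) {
		A analyzer = _analyzers.get(fieldName);

		if (analyzer != null) {
			return analyzer;
		}

		for (Map.Entry<Pattern, A> entry : _patterns.entrySet()) {
			if (entry.getKey().matcher(fieldName).matches()) {
				return entry.getValue();
			}
		}

		return _analyzer;
	}
}
```

With this shape, a field name pays for pattern matching only on its first lookup. Two caveats apply: the cache is unbounded, which is acceptable only if the set of distinct field names is bounded, and if analyzers can be registered after construction the cache would have to be cleared at that point. An invalid key also fails at construction time rather than at lookup time.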
      

            People

            • Assignee: SE Support (support-lep@liferay.com)
            • Reporter: Istvan Sajtos (istvan.sajtos)