PUBLIC - Liferay Portal Community Edition
LPS-45585

OutOfMemoryError might appear when background index writer threads cannot process DLFileEntries fast enough

    Details

      Description

      Indexing DL content is prone to OutOfMemoryError. When the Document Library holds many small files, the indexer's background threads might not be able to keep up with the objects being newly fetched from the database, and after a certain point an OutOfMemoryError is thrown.

      Exception in thread "http-bio-9080-exec-2" java.lang.OutOfMemoryError: Java heap space
      	at java.util.Arrays.copyOfRange(Arrays.java:2694)
      	at java.lang.String.<init>(String.java:203)
      	at java.lang.StringBuffer.toString(StringBuffer.java:561)
      	at java.io.StringWriter.toString(StringWriter.java:210)
      	at org.apache.tika.sax.ToTextContentHandler.toString(ToTextContentHandler.java:136)
      	at org.apache.tika.sax.ContentHandlerDecorator.toString(ContentHandlerDecorator.java:173)
      	at org.apache.tika.Tika.parseToString(Tika.java:390)
      	at org.apache.tika.Tika.parseToString(Tika.java:455)
      	at com.liferay.portal.util.FileImpl.extractText(FileImpl.java:370)
      	at com.liferay.portal.kernel.util.FileUtil.extractText(FileUtil.java:201)
      	at com.liferay.portal.kernel.search.DocumentImpl.addFile(DocumentImpl.java:140)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer.doGetDocument(DLFileEntryIndexer.java:378)
      	at com.liferay.portal.kernel.search.BaseIndexer.getDocument(BaseIndexer.java:152)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer$1.performAction(DLFileEntryIndexer.java:609)
      	at com.liferay.portal.kernel.dao.orm.BaseActionableDynamicQuery.performActionsInSingleInterval(BaseActionableDynamicQuery.java:309)
      	at com.liferay.portal.kernel.dao.orm.BaseActionableDynamicQuery.performActions(BaseActionableDynamicQuery.java:45)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer.reindexFileEntries(DLFileEntryIndexer.java:622)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer.doReindex(DLFileEntryIndexer.java:531)
      	at com.liferay.portal.kernel.search.BaseIndexer.reindex(BaseIndexer.java:486)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer$2.performAction(DLFileEntryIndexer.java:644)
      	at com.liferay.portal.kernel.dao.orm.BaseActionableDynamicQuery.performActionsInSingleInterval(BaseActionableDynamicQuery.java:309)
      	at com.liferay.portal.kernel.dao.orm.BaseActionableDynamicQuery.performActions(BaseActionableDynamicQuery.java:45)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer.reindexFolders(DLFileEntryIndexer.java:651)
      	at com.liferay.portlet.documentlibrary.util.DLFileEntryIndexer.doReindex(DLFileEntryIndexer.java:523)
      	at com.liferay.portal.kernel.search.BaseIndexer.reindex(BaseIndexer.java:486)
      	at com.liferay.portal.search.lucene.LuceneIndexer.reindex(LuceneIndexer.java:168)
      	at com.liferay.portal.search.lucene.LuceneIndexer.doReIndex(LuceneIndexer.java:134)
      	at com.liferay.portal.search.lucene.LuceneIndexer.reindex(LuceneIndexer.java:64)
      	at com.liferay.portal.search.lucene.LuceneIndexer.reindex(LuceneIndexer.java:57)
      	at com.liferay.portlet.admin.action.EditServerAction.reindex(EditServerAction.java:384)
      	at com.liferay.portlet.admin.action.EditServerAction.processAction(EditServerAction.java:199)
      	at com.liferay.portal.struts.PortletRequestProcessor.process(PortletRequestProcessor.java:163)
      

      This is a classic producer/consumer problem: the reindex fetches entries and extracts their contents faster than the background index writer threads can consume the resulting documents, so the pending documents pile up on the heap.
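
      A minimal, self-contained Java sketch of the same pattern (an illustration only, not Liferay code; every class and name in it is made up). With an unbounded hand-off queue the producer keeps allocating extracted text until the heap fills up; a bounded queue combined with a caller-runs handoff applies back-pressure instead, so only a handful of documents are alive at once.

      import java.util.concurrent.ArrayBlockingQueue;
      import java.util.concurrent.ThreadPoolExecutor;
      import java.util.concurrent.TimeUnit;

      public class BackPressureSketch {

      	public static void main(String[] args) throws Exception {

      		// Pretend every document carries ~30 MB of extracted text.
      		final int documentSize = 30 * 1024 * 1024;

      		// Consumer side: 2 writer threads, at most 5 queued documents.
      		// Swapping the bounded queue for an unbounded
      		// LinkedBlockingQueue reintroduces the OOME described above.
      		ThreadPoolExecutor indexWriters = new ThreadPoolExecutor(
      			2, 2, 0L, TimeUnit.MILLISECONDS,
      			new ArrayBlockingQueue<Runnable>(5),
      			new ThreadPoolExecutor.CallerRunsPolicy());

      		// Producer side: "extract" text and submit it for indexing.
      		// When the queue is full, CallerRunsPolicy makes this loop run
      		// the task itself, which throttles the producer.
      		for (int i = 0; i < 100; i++) {
      			final byte[] extractedText = new byte[documentSize];

      			indexWriters.execute(
      				new Runnable() {

      					public void run() {

      						// Holding a reference keeps the 30 MB
      						// buffer alive until this task runs.
      						int length = extractedText.length;

      						// Simulate a slow index writer.
      						try {
      							Thread.sleep(100);
      						}
      						catch (InterruptedException ie) {
      							Thread.currentThread().interrupt();
      						}
      					}

      				});
      		}

      		indexWriters.shutdown();
      		indexWriters.awaitTermination(5, TimeUnit.MINUTES);
      	}

      }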

      Workaround

      As a workaround, you can increase the number of worker threads available to the index writer, but if there is not enough spare CPU capacity for the consumer side to keep up with the producer, the OOME will surface again.
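
      In Liferay 6.x the index writer workers are typically driven by the liferay/search_writer messaging destination. As a sketch only (the exact bean id, class, and property names are assumptions here and should be verified against the messaging Spring configuration shipped with your portal version before overriding it), the worker pool sizes are tuned along these lines:

      <!-- Sketch only: check the bean id, class, and property names against
           the messaging Spring configuration of your portal version. -->
      <bean id="destination.search.writer"
      	class="com.liferay.portal.kernel.messaging.ParallelDestination">
      	<property name="name" value="liferay/search_writer" />

      	<!-- Assumed property names for the worker pool sizes; raising them
      	     only helps while spare CPU capacity is still available. -->
      	<property name="workersCoreSize" value="4" />
      	<property name="workersMaxSize" value="8" />
      </bean>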

      Steps to reproduce

      1) Create 100 text files of roughly 30 MB each. I used the following script to generate them.

      mkdir -p /tmp/dltest
      for i in {1..100}; do \
      	ruby -e 'a=STDIN.readlines;250000.times do; \
      		b=[];12.times do; b << a[rand(a.size)].chomp end; puts b.join(" "); \
      	end' < /usr/share/dict/words > /tmp/dltest/file.$i.txt; \
      done
      

      2) Disable DL content indexing so that the issue is not triggered prematurely, during the upload:

      dl.file.indexing.max.size=0
      

      3) Create a DL folder named test and upload the files through WebDAV.

      4) Restart the portal with ...

      a) ... the following properties set

      dl.file.indexing.max.size=100000000
      index.search.writer.max.queue.size=50
      

      b) ... ensuring that you have a large enough heap to accommodate ~1,500 MB (50 * 30 MB) of content waiting to be indexed!
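
      For example, on the bundled Tomcat (an assumption; adjust for your application server) the heap can be raised in $CATALINA_HOME/bin/setenv.sh before restarting:

      CATALINA_OPTS="$CATALINA_OPTS -Xms2560m -Xmx2560m"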

      5) In Control Panel / Server Administration, perform a full reindex.

      6) You'll receive an OOME like the one above.

        Attachments

        1. fixed.png (26 kB)
        2. fixed62.png (23 kB)
        3. heap_usage_after.png (55 kB)
        4. heap_usage_before.png (43 kB)
        5. liferay.2014-04-23.dominatortree.png (344 kB)
        6. liferay.2014-04-23.log (20 kB)
        7. liferay.2014-04-23.screenshot.png (78 kB)
        8. reproduced.png (61 kB)


        Packages

        Version Package
        6.1.X EE
        6.2.2 CE GA3
        6.2.X EE
        7.0.0 M3