PUBLIC - Liferay Portal Community Edition / LPS-46925

Solr plugin should relocate indexing of documents via Tika to Solr



      The document library uses Apache Tika to extract text content from various file types (docs, PDFs, etc.), which is then submitted to the indexer.
      This is resource-intensive on the server and takes a long time when there are many documents.
      When the Solr plugin is used, searching is offloaded to a separate server to free up resources on the main Liferay server.
      However, text extraction from document library files is still performed on the Liferay server, and only the extracted text is passed to Solr.
      It would be better if, when the Solr plugin is installed, the indexing process passed the original file to Solr and let Solr use Apache Tika to extract the text on the Solr server. This would free up additional resources on the Liferay server.
      This would need to be optional, as it would require Tika to be configured on the Solr server.
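
      As a rough illustration of the proposal, the sketch below builds the kind of request that would hand an original file to Solr for server-side extraction. It assumes the Solr server has the extracting request handler (Solr Cell) enabled at the conventional /update/extract path. The class and method names, the core name "liferay", and the "uid" field are hypothetical and are not taken from the Solr plugin.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Liferay or its Solr plugin.
class SolrCellSketch {

    // Builds the URL of the extracting request handler.
    // "literal.<field>=<value>" attaches a field value to the document Solr creates
    // from the extracted text; "commit=true" commits right after the update.
    static URI buildExtractUri(String solrBaseUrl, String core, String idField, String idValue) {
        String query = "literal." + URLEncoder.encode(idField, StandardCharsets.UTF_8)
            + "=" + URLEncoder.encode(idValue, StandardCharsets.UTF_8)
            + "&commit=true";
        // Assumes the handler is registered at /update/extract in solrconfig.xml.
        return URI.create(solrBaseUrl + "/" + core + "/update/extract?" + query);
    }

    // Builds (but does not send) a POST request whose body is the raw file,
    // so that Tika runs on the Solr server instead of the Liferay server.
    static HttpRequest buildExtractRequest(URI uri, Path file) throws FileNotFoundException {
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/octet-stream")
            .POST(HttpRequest.BodyPublishers.ofFile(file))
            .build();
    }
}
```

      The request would be sent with java.net.http.HttpClient only when the option is switched on; otherwise the existing path (extract on the Liferay server, send text to Solr) would remain the default.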




            • Assignee:
              support-lep@liferay.com SE Support
              robinkeith Robin Keith (Inactive)
            • Votes:
              1 Vote for this issue

