  1. PUBLIC - Liferay Portal Community Edition
  2. LPS-14559

SOLR search seemingly highlights pseudo random results

    Details

      Description

      I spent some time troubleshooting this issue with packet captures and finally found the solution, which I post below. Here is what I observed:

      === how to reproduce ===
      1) Add various PDF documents to the Documents Library (the documents contain the word "test").
      2) SOLR indexes the documents without problems.
      3) Search for "test" in the search portlet.
      4) The results list the PDF documents and highlight "test" by wrapping it in <em> </em> tags (as expected).
      5) However, the results also highlight "20" and "2.0", e.g. <em>20</em> <em>2.0</em>, even though "20" was not in the search query.

      === why 20 and 2.0 were getting highlighted ===

      • It turns out that the portletId was also being passed in the SOLR search web query, and my search portlet's ID happened to be "20". A companyID parameter is passed as well. I did not see any false highlights from it because its value was 10365, but if a document had contained 10365, that would have been highlighted too.
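
      To illustrate the effect, here is a toy model in plain Java. It is not Solr's highlighter and it does not model analyzer tokenization (so it will not reproduce the "2.0" case). The field name "content", the sample text, and the HighlightSketch class are assumptions made up for this sketch. It only shows how a term queried against one field can end up highlighted in another when no field match is required.

```java
import java.util.regex.Pattern;

public class HighlightSketch {

    // Each clause is {fieldName, term}. Every whole-word occurrence of a
    // clause's term in the text is wrapped in <em> tags. When
    // requireFieldMatch is true, clauses for other fields are skipped.
    static String highlight(String text, String field,
                            String[][] clauses, boolean requireFieldMatch) {
        String result = text;
        for (String[] clause : clauses) {
            String clauseField = clause[0];
            String term = clause[1];
            if (requireFieldMatch && !clauseField.equals(field)) {
                continue;
            }
            result = result.replaceAll(
                "\\b" + Pattern.quote(term) + "\\b", "<em>$0</em>");
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical query clauses: the user's term plus the IDs that
        // the portal adds to the query.
        String[][] clauses = {
            {"content", "test"},
            {"portletId", "20"},
            {"companyId", "10365"}
        };
        String text = "A test run over 20 documents";

        // Without a field match requirement, "20" is highlighted as well.
        System.out.println(highlight(text, "content", clauses, false));
        // With it, only the term queried against "content" is highlighted.
        System.out.println(highlight(text, "content", clauses, true));
    }
}
```

      With the sample text above, the first call wraps both "test" and "20" in <em> tags, while the second wraps only "test".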

      === how to fix it ===

      • The hl.requireFieldMatch search parameter must be set to true.
      • Modify: src/com/liferay/portal/search/solr/SolrIndexSearcherImpl.java (around line 238, in the translateQuery() function).
      • Add: solrQuery.setRequireFieldMatch(true);
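
      For anyone checking this with a packet capture, the following sketch shows roughly what the request parameters look like once the flag is set. It uses only the JDK and builds the query string by hand. The query text, the "content" field name, and the SolrHighlightParams class are assumptions for illustration, not what Liferay actually sends. hl, hl.fl and hl.requireFieldMatch are standard Solr highlighting parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrHighlightParams {

    // Collects the highlighting parameters for a query, including the
    // hl.requireFieldMatch flag that restricts highlighting to fields
    // the term was actually queried against.
    static Map<String, String> highlightParams(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("hl", "true");
        params.put("hl.fl", "content"); // assumed field name
        params.put("hl.requireFieldMatch", "true");
        return params;
    }

    // URL-encodes the parameters into a query string.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical query combining the user's term with the portlet ID.
        String query = "content:test AND portletId:20";
        System.out.println(toQueryString(highlightParams(query)));
    }
}
```

      If hl.requireFieldMatch=true shows up in the captured request, the fix is in effect.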

      === other notes ===

      • This can be a particularly annoying bug because many documents seem to contain "20" or "%20".
      • It would be nice to be able to set all of the solrQuery.setBlah() parameters in a .properties file somewhere. All of the properties are hard-coded right now.
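
      A rough sketch of what such a .properties approach could look like, using only java.util.Properties. The "solr.query.param." key prefix and the SolrQuerySettings class are made up for this example and do not exist in Liferay. The idea is simply that every key under the prefix becomes a Solr request parameter.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class SolrQuerySettings {

    // Hypothetical prefix marking properties that map to Solr parameters.
    static final String PREFIX = "solr.query.param.";

    // Parses .properties-formatted text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Returns the Solr parameters found under the prefix, with the prefix
    // stripped from each key. Other properties are ignored.
    static Map<String, String> solrParams(Properties props) {
        Map<String, String> params = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                params.put(name.substring(PREFIX.length()),
                           props.getProperty(name).trim());
            }
        }
        return params;
    }

    public static void main(String[] args) {
        String config =
            "solr.query.param.hl.requireFieldMatch=true\n"
          + "solr.query.param.hl.fragsize=80\n"
          + "unrelated.key=ignored\n";
        System.out.println(solrParams(parse(config)));
    }
}
```

      The resulting map could then be applied to the query object in one loop instead of through individual hard-coded setter calls.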

        Attachments

        1. fixed.jpg (145 kB)
        2. reproduced.jpg (135 kB)
        3. trunk87531SOLR.png (90 kB)

            People

            • Votes: 4
              Watchers: 9

              Dates

              • Created:
                Updated:
                Resolved:
                Days since last comment: 5 years, 12 weeks, 6 days ago

                Packages

                Version Package
                6.0.X EE
                6.1.X EE
                6.2.X EE
                7.0.0 M2