  1. PUBLIC - Liferay Portal Community Edition
  2. LPS-70315

Journal Article non-localized content field is being searched in Elasticsearch when searching with other Assets/portlets

    Details

    • Fix Priority:
      4

      Description

      Creating this ticket to check the "content" and "description" fields, since the "title" field may already have been resolved via LPS-67687.

      When querying Elasticsearch, the indexers may search both the localized and the non-localized version of a field, which skews the relevancy score/weight.

      This issue may also be caused by BaseIndexer.java and other indexers, since their post-processing appears to add the non-localized fields to the search query. When Journal Article is searched together with other assets, both the non-localized and the localized fields are then searched.

      Master BaseIndexer.java, GitId: 5448a6e95e2f429b27725fa9b044075c119d549f

      	@Override
      	public void postProcessSearchQuery(
      			BooleanQuery searchQuery, BooleanFilter fullQueryBooleanFilter,
      			SearchContext searchContext)
      		throws Exception {
      
      		String keywords = searchContext.getKeywords();
      
      		if (Validator.isNull(keywords)) {
      			addSearchTerm(searchQuery, searchContext, Field.DESCRIPTION, false);
      			addSearchTerm(searchQuery, searchContext, Field.TITLE, false);
      			addSearchTerm(searchQuery, searchContext, Field.USER_NAME, false);
      		}
      	}
      

      We should consider the following:
      (1) Determine whether localization is needed based on which assets are using ES search; otherwise default to the English localization or to no localization at all (some portlets that developers want to index may not need localization).
      (2) When Journal Article is searched together with another portlet that has no localization, the search query built in postProcessSearchQuery() appears to include every field contributed by either indexer (description and description_en).
      (3) If possible, nest the search query when more than one indexer/asset is involved, so that we can limit which fields (localized or not) are queried in ES.
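
      A rough sketch of what the nesting in (3) could look like, written as plain Java with no Liferay dependencies. The class name, the buildNestedQuery() helper, the short entry class names, and the Lucene-style query string are all hypothetical and only illustrate the shape of the query; this is not how the portal builds queries today.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names exist in Liferay.
class NestedPerIndexerQuerySketch {

    // Builds one clause per entry class, so each asset type is matched
    // only against the fields its own indexer asked for.
    static String buildNestedQuery(
            Map<String, List<String>> fieldsByEntryClassName, String keywords) {

        List<String> perIndexerClauses = new ArrayList<>();

        for (Map.Entry<String, List<String>> entry :
                fieldsByEntryClassName.entrySet()) {

            List<String> fieldClauses = new ArrayList<>();

            for (String field : entry.getValue()) {
                fieldClauses.add(field + ":" + keywords);
            }

            // Restrict this group of fields to documents of one entry class.
            perIndexerClauses.add(
                "(entryClassName:" + entry.getKey() + " AND (" +
                    String.join(" OR ", fieldClauses) + "))");
        }

        return String.join(" OR ", perIndexerClauses);
    }

    public static void main(String[] args) {
        Map<String, List<String>> fieldsByEntryClassName =
            new LinkedHashMap<>();

        // Assumed field sets, taken from the examples in this ticket.
        fieldsByEntryClassName.put(
            "JournalArticle", List.of("content_en", "description_en"));
        fieldsByEntryClassName.put(
            "BlogsEntry", List.of("description", "title"));

        System.out.println(
            buildNestedQuery(fieldsByEntryClassName, "liferay"));
    }
}
```

      With this shape, a Journal Article document would only be scored against its localized fields, and a Blogs entry only against its non-localized ones, instead of every document being scored against the union of both.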

      Additional Notes:

      • Tested with the Journal Article and Knowledge Base portlets. The Journal Article indexer searches both the content_en and content fields because the Admin Indexer adds the non-localized "content" field as a search term.
      • FacetedSearcher's createFullQuery() loops through every indexer's postProcessSearchQuery(). If JournalArticleIndexer uses addSearchLocalizedTerm() for description, but BlogsEntryIndexer has no postProcessSearchQuery() of its own and therefore falls back to BaseIndexer's, which uses addSearchTerm() for description, then the search query sent to the searchRequestBuilder contains both description and description_en.

      FacetedSearcher code snippet:

      for (String entryClassName : searchContext.getEntryClassNames()) {
      	// "indexer" is the indexer for entryClassName (lookup omitted here)
      	indexer.postProcessSearchQuery(
      		searchQuery, fullQueryBooleanFilter, searchContext);
      }
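
      The effect of the loop above can be modeled with a small self-contained sketch. Everything in it is a simplified stand-in and an assumption: SketchIndexer, BASE_DEFAULT, LOCALIZED and createFullQueryFields() are hypothetical names, the shared query is reduced to a set of field names, and the "userName" field name is illustrative. It only shows how one shared query ends up holding both description and description_en.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified model; these types are not the real Liferay API.
class QueryFieldAccumulationSketch {

    // Stand-in for Indexer.postProcessSearchQuery(): each indexer adds
    // the field names it wants searched to the one shared query.
    interface SketchIndexer {
        void postProcessSearchQuery(
            Set<String> searchQueryFields, String languageSuffix);
    }

    // Models the BaseIndexer default: non-localized terms only.
    static final SketchIndexer BASE_DEFAULT = (fields, languageSuffix) -> {
        fields.add("description");
        fields.add("title");
        fields.add("userName");
    };

    // Models an indexer that adds localized terms such as description_en.
    static final SketchIndexer LOCALIZED = (fields, languageSuffix) -> {
        fields.add("content_" + languageSuffix);
        fields.add("description_" + languageSuffix);
    };

    // Models the createFullQuery() loop: every indexer post-processes
    // the same query object, so the field sets are merged.
    static Set<String> createFullQueryFields(
            List<SketchIndexer> indexers, String languageSuffix) {

        Set<String> fields = new LinkedHashSet<>();

        for (SketchIndexer indexer : indexers) {
            indexer.postProcessSearchQuery(fields, languageSuffix);
        }

        return fields;
    }

    public static void main(String[] args) {
        // One localized indexer plus one that falls back to the default.
        System.out.println(
            createFullQueryFields(List.of(LOCALIZED, BASE_DEFAULT), "en"));
    }
}
```

      Because the query is shared rather than scoped per indexer, the merged field set applies to every document type in the search, which is the behavior described in the notes above.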
      

              People

              Assignee:
              support-lep@liferay.com SE Support
              Reporter:
              jenny.chen Jenny Chen
              Participants of an Issue:
              Recent user:
              Sophia Zhang

                Dates

                Days since last comment:
                3 years, 30 weeks, 4 days ago
