Creating this ticket to check the "content" and "description" fields, as the "title" field may have been resolved via
When querying Elasticsearch, the indexers may search both localized and non-localized fields, which skews the relevancy score/weight.
This issue might also be caused by BaseIndexer.java and the other indexers, as they seem to add the non-localized fields to the search query. When Journal Article is also included in the search, both the non-localized and the localized fields are searched.
Master BaseIndexer.java, GitId: 5448a6e95e2f429b27725fa9b044075c119d549f
We should:
(1) determine whether localization is needed based on which assets use ES search, and otherwise default to the English localization or to no localization at all (some portlets that developers want to index may not need localization);
(2) address the case where Journal Article is searched together with a portlet that has no localization, since the search query built in postProcessSearchQuery() seems to include every field being searched (description and description_en); and/or
(3) if possible, nest the search query when more than one indexer/asset is involved, so that we can limit which fields (localized or not) are queried in ES.
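To make the nesting idea in (3) concrete, here is a rough sketch in plain Java. It is not Liferay code: the class NestedQuerySketch, the method buildNestedQuery(), the short entry class names, and the field lists per asset are all hypothetical, and the output is a Lucene-style query string used only for illustration. It assumes the indexed documents carry an entryClassName field that can be used to scope each group of fields to one asset type.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Liferay code: scopes each asset's fields to its own clause.
class NestedQuerySketch {

    // Builds one clause per entry class so that an asset's fields are only
    // matched against documents of that asset type. Keywords are not escaped.
    static String buildNestedQuery(
            Map<String, List<String>> fieldsByEntryClass, String keywords) {

        List<String> clauses = new ArrayList<>();

        for (Map.Entry<String, List<String>> entry : fieldsByEntryClass.entrySet()) {
            List<String> fieldTerms = new ArrayList<>();

            for (String field : entry.getValue()) {
                fieldTerms.add(field + ":" + keywords);
            }

            // entryClassName is assumed to identify the asset type of a document.
            clauses.add(
                "(entryClassName:" + entry.getKey() + " AND (" +
                    String.join(" OR ", fieldTerms) + "))");
        }

        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        // Illustrative mapping only; the real field sets would come from each indexer.
        Map<String, List<String>> fieldsByEntryClass = new LinkedHashMap<>();
        fieldsByEntryClass.put("JournalArticle", List.of("content_en", "description_en"));
        fieldsByEntryClass.put("KBArticle", List.of("content", "description"));

        System.out.println(buildNestedQuery(fieldsByEntryClass, "liferay"));
    }
}
```

With this shape, description_en would only be matched against Journal Article documents and description only against the non-localized asset, so the two field variants would no longer both contribute to the score of the same document.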
- Tested with the Journal Article and Knowledge Base portlets: the Journal Article indexer searches both the content_en and content fields because the Admin Indexer adds the non-localized "content" field as a search term.
- FacetedSearcher's createFullQuery() loops through every indexer's postProcessSearchQuery(). If JournalArticleIndexer uses addSearchLocalizedTerm() for description but BlogsEntryIndexer has no postProcessSearchQuery() of its own, BlogsEntryIndexer falls back to BaseIndexer's, which uses addSearchTerm() for description. The search query sent to the searchRequestBuilder then contains both description and description_en.
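The field mixing described in the notes above can be reproduced with a small standalone model. This is a simplified sketch, not the real Liferay classes: SketchIndexer, DefaultSketchIndexer, LocalizedSketchIndexer, and createFullQueryFields() are hypothetical stand-ins, and the query is reduced to a set of field names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-ins; these are not the real Liferay classes.
class FieldMixingSketch {

    // Minimal stand-in for an indexer that contributes fields to a shared query.
    interface SketchIndexer {
        void postProcessSearchQuery(Set<String> queryFields, String languageId);
    }

    // Models the fallback behavior: adds the non-localized field.
    static class DefaultSketchIndexer implements SketchIndexer {
        @Override
        public void postProcessSearchQuery(Set<String> queryFields, String languageId) {
            queryFields.add("description");
        }
    }

    // Models an indexer that overrides the fallback and adds the localized field.
    static class LocalizedSketchIndexer implements SketchIndexer {
        @Override
        public void postProcessSearchQuery(Set<String> queryFields, String languageId) {
            queryFields.add("description_" + languageId);
        }
    }

    // Models the loop over all indexers: every indexer writes into the same query.
    static Set<String> createFullQueryFields(
            List<SketchIndexer> indexers, String languageId) {

        Set<String> queryFields = new LinkedHashSet<>();

        for (SketchIndexer indexer : indexers) {
            indexer.postProcessSearchQuery(queryFields, languageId);
        }

        return queryFields;
    }

    public static void main(String[] args) {
        List<SketchIndexer> indexers = new ArrayList<>();
        indexers.add(new LocalizedSketchIndexer()); // plays the Journal Article role
        indexers.add(new DefaultSketchIndexer());   // plays the Blogs Entry role

        // Prints [description_en, description]
        System.out.println(createFullQueryFields(indexers, "en"));
    }
}
```

Because every indexer appends to one shared query, the combined field set contains both variants as soon as a single indexer relies on the fallback.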
FacetedSearcher code snippet: