LPS-72884 (PUBLIC - Liferay Portal Community Edition)

Japanese text in the DM content field is being sent to the English analyzer in Elasticsearch

    Details

      Description

      Steps to reproduce:

      1. Add a Documents and Media portlet to a page
      2. Upload the attached text files
      3. Search for 新規

      Expected Result:
      Only Japanese1.txt is returned in the search results.

      Actual Result:
      Both Japanese1.txt and Japanese2.txt are returned in the search results. The kuromoji analyzer treats 新規 as a single word, so only documents that contain it as a whole token should match. See JournalArticleIndexerLocalizedContentTest.java for reference.
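
      The difference between the two behaviours can be illustrated with a toy sketch. It is not Liferay or Elasticsearch code. It assumes that the English analyzer's tokenizer emits each ideograph as its own token, and it stands in for kuromoji with a greedy longest-match lookup against a small hypothetical lexicon. The sample documents, the lexicon, and the AND-style matching rule are all invented for illustration and are not the contents of the attached files.

```python
# Toy illustration only: not Liferay or Elasticsearch code.

def per_character_tokens(text):
    """Approximate a tokenizer that emits every CJK character as its own token."""
    return [ch for ch in text if not ch.isspace()]


def dictionary_tokens(text, lexicon):
    """Greedy longest-match segmentation against a small hypothetical lexicon,
    standing in for a morphological analyzer such as kuromoji."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        match = text[i]  # fall back to a single character
        for word in lexicon:
            if text.startswith(word, i) and len(word) > len(match):
                match = word
        tokens.append(match)
        i += len(match)
    return tokens


def matches(query_tokens, doc_tokens):
    """Assumed AND semantics: the document must contain every query token."""
    return all(token in doc_tokens for token in query_tokens)


LEXICON = ["新規", "新しい", "規則", "作成"]  # hypothetical lexicon
DOCS = {
    "doc_a": "新規作成",    # contains the word 新規
    "doc_b": "新しい規則",  # contains 新 and 規, but not the word 新規
}
QUERY = "新規"


def search(tokenize):
    """Return the names of all sample documents matching QUERY."""
    query_tokens = tokenize(QUERY)
    return sorted(
        name for name, text in DOCS.items()
        if matches(query_tokens, tokenize(text))
    )


char_hits = search(per_character_tokens)
dict_hits = search(lambda text: dictionary_tokens(text, LEXICON))
print(char_hits)  # ['doc_a', 'doc_b']
print(dict_hits)  # ['doc_a']
```

      Under these assumptions, per-character tokenization also matches the document that merely contains 新 and 規 separately, while word-level segmentation matches only the document that contains 新規 as a word.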

      Reproduced on:
      Tomcat 8.0.32 + MySQL 5.6.
      Portal master GIT ID: b07d428fca59234300e90b52b2dba63b3d89c1b3
      Portal ee-7.0.x GIT ID: 3d74b73e65685db9f0090608ccc07f454d026afe

      QA/Solution Notes
      For existing documents, the fix takes effect only after Documents and Media assets are reindexed through Server Administration.

        Attachments

        1. Japanese1.txt
          0.5 kB
        2. Japanese2.txt
          0.5 kB
        3. Japanese3.txt
          0.5 kB

                  Packages

                  Version Package
                  7.0.0 DXP FP31
                  7.0.5 CE GA6
                  7.0.X
                  7.1.X
                  Master