- Type: Bug
- Status: Closed
- Resolution: Fixed
- Affects Version/s: 7.1.X
- Component/s: Search Infrastructure
- Branch Version/s: 7.1.x
- Backported to Branch: Committed
- Story Points: 0.25
- Fix Priority: 5
- Epic Link:
- Sprint: Search | S02 Sprint 3
- Git Pull Request: Fixed as part of LPS-84665
Summary
After creating a few Japanese-language tags and assigning them to Documents and Media assets, search returns unrelated results. The following example lists steps to reproduce in DM, but the problem occurs in WCM as well.
Based on the behavior we have observed, it seems that category names, like the tags in LPS-84665, are stored in a default index field that uses the Standard analyzer, which breaks Japanese kanji down into single characters. This issue is related to LPS-84665, but it concerns Categories instead of Tags.
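The suspected tokenization can be illustrated with a minimal Python sketch. It is not the analyzer's actual code: `is_han` and `unigram_tokens` are hypothetical helpers that only mimic the behavior described above, emitting one token per kanji.

```python
def is_han(ch: str) -> bool:
    """Return True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def unigram_tokens(text: str) -> list[str]:
    """Emit each kanji as its own token (assumed approximation of the observed behavior)."""
    return [ch for ch in text if is_han(ch)]


tokyo = unigram_tokens("東京")  # ['東', '京']
kyoto = unigram_tokens("京都")  # ['京', '都']

# The two category names share exactly one token.
shared = set(tokyo) & set(kyoto)
print(tokyo, kyoto, shared)
```

Under this assumption, 東京 and 京都 are no longer distinct terms in the index; they overlap on the token 京.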
Steps to Reproduce
- Unzip vanilla Liferay bundle (clean, no portal-setup.wizard, etc.)
- Initialize Liferay with Japanese as default language
- Restart Server
- After restarting, sign in with test@liferay.com (you can use English for ease of use, i.e. append /en/ after hostname:port)
- Navigate to Product Menu > Liferay DXP (Site) > Categorization > Categories
- Click Add Vocabulary and enter ボキャブラリ in the Japanese field
- Click Add Category and add 東京 in Japanese (note: this is Tokyo)
- Click Add Category and add 京都 in Japanese (note: this is Kyoto)
- Navigate to Product Menu > Liferay DXP (Site) > Content > Web Content and create a Basic Web Content
- Add Web Content 豊島区 in Japanese, using it for both the Title and Summary. Select the 東京 category in Metadata. (note: Toshima-ku is in Tokyo)
- Add another Web Content 下京区 in Japanese, using it for both the Title and Summary. Select the 京都 category in Metadata. (note: Shimogyo-ku is in Kyoto)
- Change Display setting to Japanese (i.e. append /ja/ after hostname:port)
- Return to the home page, and in the Search widget (top right corner), search for 東京
Actual Result
Even though we searched for 東京, the search results include both 豊島区 and 下京区.
This is a problem because the results for 東京 should include only 豊島区, and NOT 下京区. The only thing 東京, 京都, and 下京区 have in common is the character 京, so it seems that search is treating a match on that single character as a hit.
Expected Behavior
Results for 東京 should include only 豊島区, since it is the only Web Content categorized under 東京.
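The difference between the actual and expected results can be reproduced with a toy model in Python. This is a sketch under stated assumptions, not Liferay's or Elasticsearch's query logic: `DOCS`, `search_single_chars`, and `search_whole_category` are hypothetical names, and the whole-name comparison only stands in for the expected behavior; it is not necessarily how the fix is implemented.

```python
# Hypothetical data from the steps above: Web Content title -> assigned category.
DOCS = {"豊島区": "東京", "下京区": "京都"}


def search_single_chars(query: str) -> list[str]:
    """Hit when any single character of the query appears in the title or category."""
    wanted = set(query)
    return sorted(
        title
        for title, category in DOCS.items()
        if wanted & (set(title) | set(category))
    )


def search_whole_category(query: str) -> list[str]:
    """Hit only when the category name equals the query as a whole."""
    return sorted(title for title, category in DOCS.items() if category == query)


actual = search_single_chars("東京")      # both documents match via 東 or 京
expected = search_whole_category("東京")  # only the document categorized 東京
print(actual, expected)
```

In this model 下京区 is returned only because its title and its category 京都 contain 京, which matches the explanation given under Actual Result.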
Reproduced in
7.1.x private Commit: 815320372a34faa0ccd0ed1d4989af7d1502c5e6