Details
-
Bug
-
Status: Closed
-
Resolution: Fixed
-
6.2.10 EE GA1, 7.0.0 DXP FP54, 7.1.10 DXP GA1, Master
-
7.2.x, 7.1.x, 7.0.x
-
Committed
-
0.25
-
3
-
Localization
Description
The portal doesn't normalize Unicode strings. This affects nearly every part of the portal, but here are two scenarios:
Scenario 1:
- Add a tag with the name ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+103A U+1037)
- Add a tag with the name ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+1037 U+103A)
Expected result:
Tags are supposed to be unique. Since the two strings are canonically equivalent, the portal should prevent the second tag from being created.
Actual result:
Both tags are created because the portal treats the strings as different.
Scenario 2:
- Create a web content with the title ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+103A U+1037)
- Search for the web content using ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+1037 U+103A)
Expected result:
Since the two strings are canonically equivalent, the portal should return the web content in the search results.
Actual result:
The web content is not returned in the search results.
The problem lies in the last two characters of the string, U+103A and U+1037. Both are combining marks that modify the base character U+1004, and the order of the two marks should not matter.
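The equivalence of the two code point sequences can be checked with the JDK's `java.text.Normalizer`. The following is a minimal sketch, not the portal's actual fix; the class name `NormalizationCheck` and its members are made up for illustration, and NFC is assumed as the target form (any single normalization form applied consistently would do).

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only; not part of the portal code base.
class NormalizationCheck {

    // Tag name from the scenarios, ending in U+103A U+1037.
    static final String ASAT_FIRST =
        "\u1016\u103C\u1031\u1006\u102D\u102F\u1001\u103D\u1004\u103A\u1037";

    // Same tag name, ending in U+1037 U+103A.
    static final String DOT_FIRST =
        "\u1016\u103C\u1031\u1006\u102D\u102F\u1001\u103D\u1004\u1037\u103A";

    // Convert a string to NFC so that canonically equivalent
    // inputs end up as identical code point sequences.
    static String nfc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFC);
    }

    // Compare two strings by their normalized forms instead of raw code points.
    static boolean canonicallyEqual(String a, String b) {
        return nfc(a).equals(nfc(b));
    }

    public static void main(String[] args) {
        // Raw comparison sees two different strings.
        System.out.println(ASAT_FIRST.equals(DOT_FIRST));
        // Normalized comparison treats them as the same string.
        System.out.println(canonicallyEqual(ASAT_FIRST, DOT_FIRST));
    }
}
```

A plain `String.equals` on the raw input is what leads to the duplicate tag in Scenario 1; comparing the normalized forms removes the dependence on the order of the two combining marks.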
Reference: http://unicode.org/faq/normalization.html#1
Programs should always compare canonical-equivalent Unicode strings as equal
- All input needs to be normalized before persisting to database
- All input needs to be normalized before indexing
- All queries need to be normalized before searching
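The three requirements above amount to normalizing at every boundary where a string enters storage or search. The sketch below illustrates the idea with an in-memory stand-in; `NormalizingStore` and its methods are hypothetical and do not correspond to any portal service or search API, and NFC is again an assumed choice of normalization form.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the database and the search index.
class NormalizingStore {

    private final Set<String> tagNames = new HashSet<>();
    private final Map<String, String> titleIndex = new HashMap<>();

    static String nfc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFC);
    }

    // Normalize before persisting: returns false when a canonically
    // equivalent tag name is already stored.
    boolean addTag(String name) {
        return tagNames.add(nfc(name));
    }

    // Normalize before indexing.
    void indexContent(String title, String contentId) {
        titleIndex.put(nfc(title), contentId);
    }

    // Normalize the query before searching; returns null when nothing matches.
    String search(String query) {
        return titleIndex.get(nfc(query));
    }
}
```

With this arrangement, adding the second tag from Scenario 1 is rejected as a duplicate, and the search from Scenario 2 finds the web content regardless of the order in which U+103A and U+1037 were typed.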
Issue Links
- fixes: LPS-91542 AddSchedulerMVCActionCommand unit tests failed due to initializer errors (Closed)