  1. PUBLIC - Liferay Portal Community Edition
  2. LPS-83736

Portal doesn't normalize unicode strings

    Details

      Description

      The portal doesn't normalize Unicode strings. This affects nearly everything in the portal; here are two scenarios:

      Scenario 1:

      1. Add a tag with the name ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+103A U+1037)
      2. Add a tag with the name ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+1037 U+103A)

      Expected result:
      Tags are supposed to be unique. Since the two strings are canonically equivalent, the portal should prevent the second tag from being created.
      Actual result:
      Both tags are created because the portal treats the strings as different.

      Scenario 2:

      1. Create a web content with the title ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+103A U+1037)
      2. Search for the web content using ဖြေဆိုခွင့် (U+1016 U+103C U+1031 U+1006 U+102D U+102F U+1001 U+103D U+1004 U+1037 U+103A)

      Expected result:
      Since the two strings are canonically equivalent, the portal should return the web content in the search results.
      Actual result:
      The web content is not returned in the search results.


      The problem is with the last two characters in the string, U+103A and U+1037. Both are combining marks that modify the character U+1004, so the order of the two marks should not matter.

      Reference: http://unicode.org/faq/normalization.html#1

      Programs should always compare canonical-equivalent Unicode strings as equal
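
      The equivalence of the two spellings can be checked with the JDK's java.text.Normalizer. The following is a minimal sketch, not portal code: the class name, constant names, and the canonicallyEqual helper are hypothetical, and the choice of NFC is an assumption, since the report does not say which normalization form the portal should use.

```java
import java.text.Normalizer;

public class CanonicalEquivalenceDemo {

    // First spelling from the report: ends with U+1004 U+103A U+1037.
    static final String ASAT_FIRST =
        "\u1016\u103C\u1031\u1006\u102D\u102F\u1001\u103D\u1004\u103A\u1037";

    // Second spelling from the report: ends with U+1004 U+1037 U+103A.
    static final String DOT_BELOW_FIRST =
        "\u1016\u103C\u1031\u1006\u102D\u102F\u1001\u103D\u1004\u1037\u103A";

    // Hypothetical helper: compares two strings after converting both to NFC
    // (assumed form), which puts combining marks into canonical order.
    static boolean canonicallyEqual(String a, String b) {
        String normalizedA = Normalizer.normalize(a, Normalizer.Form.NFC);
        String normalizedB = Normalizer.normalize(b, Normalizer.Form.NFC);
        return normalizedA.equals(normalizedB);
    }

    public static void main(String[] args) {
        // Plain code-unit comparison, which is what the report describes.
        System.out.println("raw equals:        " + ASAT_FIRST.equals(DOT_BELOW_FIRST));
        // Comparison after normalization.
        System.out.println("normalized equals: " + canonicallyEqual(ASAT_FIRST, DOT_BELOW_FIRST));
    }
}
```

      The raw comparison differs because the two marks are stored in a different order; normalization reorders them by canonical combining class, so both spellings end up as the same code point sequence.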


      • All input needs to be normalized before persisting to database
      • All input needs to be normalized before indexing
      • All queries need to be normalized before searching
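
      The three requirements above could be sketched as a single normalization step applied at each boundary. This is an illustration only: NormalizingStore, its in-memory set and map, and all method names are hypothetical stand-ins for the portal's persistence and search layers, and NFC is again an assumed form.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class NormalizingStore {

    // In-memory stand-ins (assumptions) for the database and the search index.
    private final Set<String> tagNames = new HashSet<>();
    private final Map<String, String> titleIndex = new HashMap<>();

    // Single entry point for user-supplied text; NFC is an assumed choice.
    public static String normalize(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFC);
    }

    // Persisting: returns false if a canonically equivalent tag already exists.
    public boolean addTag(String name) {
        return tagNames.add(normalize(name));
    }

    // Indexing: the title is normalized before it becomes an index key.
    public void indexArticle(String title, String articleId) {
        titleIndex.put(normalize(title), articleId);
    }

    // Searching: the query is normalized the same way as the indexed title.
    // Returns the article id, or null if nothing matches.
    public String search(String query) {
        return titleIndex.get(normalize(query));
    }
}
```

      With this arrangement, adding the second tag from Scenario 1 would be rejected as a duplicate, and the query from Scenario 2 would find the article indexed under the first spelling.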

        Attachments

        1. Fixed Search.png (47 kB)
        2. Fixed Tag.png (45 kB)
        3. Search.png (55 kB)
        4. Tag.png (36 kB)

                  Packages

                  Version    Package
                  7.0.0      DXP FP59
                  7.0.0      DXP SP9
                  7.0.X
                  7.1.10     DXP FP2
                  7.1.1      CE GA2
                  7.1.10.1   SP1
                  7.1.X
                  Master