Throughout Liferay's code, some URLs are escaped with HtmlUtil.escapeAttribute(), while others only use escape():
- portal-web/docroot/html/common/themes/top_head.jsp uses escapeAttribute() (but not for the shortcut icon)
- portal-web/docroot/html/common/themes/top_portlet_resources_js.jspf uses escape()
- portal-web/docroot/html/common/themes/top_js.jspf uses escape()
- portal-web/docroot/html/common/themes/top_portlet_resources_css.jspf uses escape()
There are certainly more occurrences.
Why is there such an inconsistency? All of those usages of escape() are in the context of an HTML attribute, so shouldn't they use escapeAttribute() (or escapeHREF()) instead?
This leads to the second problem, which is related to LPS-30994, an issue that was closed as "Won't Fix".
A comment on that issue correctly states that escapeAttribute() using character references is perfectly legal.
However, I'm facing a problem while using IBM's WebSEAL URL rewriting functionality for SSO. WebSEAL does not detect URLs that contain character references and therefore does not rewrite them.
While one could argue that this is a problem with WebSEAL and not Liferay, I would like to suggest another solution. As a reply on the related forum post https://www.liferay.com/de/community/forums/-/message_boards/message/17885135#_19_message_22874046 states, using HtmlUtil.escape() instead of escapeAttribute() "fixes" the problem of character references in URLs.
My preferred solution would be to keep using escapeAttribute(), but to change its implementation. I do not know why portal-impl/src/com/liferay/portal/util/HtmlImpl.java is coded as it is, but it should be fairly safe to add some more characters to the exception logic of escape(). When in ESCAPE_MODE_ATTRIBUTE, the characters that make up a URL like protocol://host:port/path, i.e. ":" (colon), "/" (forward slash) and "." (dot, for dotted IPv4 address notation of the host), do not need to be replaced with character references, since they are ASCII and can be used as-is in HTML attributes anyway.
The OWASP recommendation is about preventing certain characters from breaking out of unquoted attributes, but the characters mentioned above are not on that dangerous list.
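To make the proposal concrete, here is a minimal, self-contained sketch of attribute escaping with the three extra exemptions. It is not Liferay's HtmlImpl code; the class name AttributeEscapeSketch, the URL_SAFE constant, and the choice of hexadecimal character references are assumptions made purely for illustration.

```java
public final class AttributeEscapeSketch {

    // Hypothetical exemption list: the characters proposed in this report.
    private static final String URL_SAFE = ":/.";

    // Escapes everything except ASCII letters, ASCII digits and URL_SAFE
    // characters as hexadecimal character references (assumed output format).
    public static String escapeAttribute(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean asciiAlnum = (cp >= 'a' && cp <= 'z')
                || (cp >= 'A' && cp <= 'Z')
                || (cp >= '0' && cp <= '9');
            if (asciiAlnum || URL_SAFE.indexOf(cp) >= 0) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The URL skeleton stays readable; the quote and space are still escaped.
        System.out.println(escapeAttribute("http://10.0.0.1:8080/web/guest"));
        System.out.println(escapeAttribute("\" onmouseover"));
    }
}
```

With such an exemption list, a plain protocol://host:port/path URL passes through unchanged, so a rewriting proxy that matches on the literal URL can still find it, while quotes, spaces, and angle brackets continue to be replaced.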
Steps to reproduce the issue:
- Using Liferay 6.2.0 CE GA1 (most likely 6.1.1 CE GA2 or previous versions as well, see related issue)
- Open any page in Liferay Portal
- Inspect differently encoded/escaped URLs in HTML attributes
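The inspection step can be partly automated. The following is a rough sketch, assuming the page source has been saved as a string and that only double-quoted href and src attributes are of interest; the class name EscapedUrlFinder and both regular expressions are hypothetical helpers, not part of Liferay or WebSEAL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class EscapedUrlFinder {

    // Matches double-quoted href/src attribute values (a simplification).
    private static final Pattern URL_ATTR = Pattern.compile(
        "\\b(?:href|src)\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Matches numeric character references such as &#x2f; or &#47;.
    private static final Pattern NUMERIC_REF =
        Pattern.compile("&#[xX]?[0-9a-fA-F]+;");

    // Returns the attribute values that contain numeric character references.
    public static List<String> findCharRefUrls(String html) {
        List<String> hits = new ArrayList<>();
        Matcher m = URL_ATTR.matcher(html);
        while (m.find()) {
            String value = m.group(1);
            if (NUMERIC_REF.matcher(value).find()) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample markup mixing both escaping styles.
        String html = "<link href=\"http&#x3a;&#x2f;&#x2f;host&#x2f;a.css\">"
            + "<script src=\"/html/js/barebone.jsp\"></script>";
        for (String url : findCharRefUrls(html)) {
            System.out.println(url);
        }
    }
}
```

Attribute values reported by such a scan are the ones a literal URL matcher would be expected to miss; values without numeric references are left out.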