Details

    • Type: Improvement
    • Status: Contributed Solution
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.3.3
    • Fix Version/s: 5.1.0
    • Component/s: None
    • Labels:
      None
    • Environment:
      JVM: 1.4.2_12
      Liferay version: 4.3.2

      Description

      I have noticed that the portal pages do not validate as XHTML; most of them fail because the & character is not written as "&amp;". I have tried to solve this with a JTidy filter; JTidy is an API for converting pages to XHTML.
      I modified the web.xml of ROOT.war and added these lines:

      <!-- Angelo: added JTidy filter for XHTML -->
      <filter>
        <filter-name>JTidyFilter</filter-name>
        <description>Filter that makes the pages valid XHTML</description>
        <filter-class>org.w3c.tidy.servlet.filter.JTidyFilter</filter-class>
        <init-param>
          <param-name>properties.filename</param-name>
          <param-value>scfconfiguration.properties</param-value>
        </init-param>
      </filter>
      <!-- Angelo: end of the JTidy filter addition; the rest of web.xml is original -->
      <!-- Angelo: added the filter mappings -->
      <filter-mapping>
        <filter-name>JTidyFilter</filter-name>
        <url-pattern>/c/portal/c</url-pattern>
      </filter-mapping>
      <filter-mapping>
        <filter-name>JTidyFilter</filter-name>
        <url-pattern>/group/*</url-pattern>
      </filter-mapping>
      <filter-mapping>
        <filter-name>JTidyFilter</filter-name>
        <url-pattern>/user/*</url-pattern>
      </filter-mapping>
      <filter-mapping>
        <filter-name>JTidyFilter</filter-name>
        <url-pattern>/web/*</url-pattern>
      </filter-mapping>
      <!-- Angelo: end of the filter mappings -->
      In the lib directory of my portal.ear I also added the file jtidyallinone-r8-SNAPSHOT.jar.
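
      To make the validation problem concrete, here is a minimal sketch of the single repair mentioned above: rewriting a bare & as &amp; while leaving existing entities alone. This is not how JTidy works internally (JTidy parses and re-serializes the whole page); the class and method names are hypothetical and only illustrate the error being fixed.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of JTidy or Liferay.
final class AmpersandEscaper {
    // Matches '&' that does not already start a named, decimal, or hex entity.
    private static final Pattern BARE_AMP =
        Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String markup) {
        return BARE_AMP.matcher(markup).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // A typical portal-style link with an unescaped '&' in the query string.
        System.out.println(escapeBareAmpersands("<a href=\"/web/guest?a=1&b=2\">link</a>"));
    }
}
```

      A regular expression like this only covers the ampersand case; unclosed tags and similar problems still need a real parser such as the filter described here.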

        Activity

        Angelo Immediata added a comment -

        This is the JTidy JAR I used in my classpath.

        Angelo Immediata added a comment -

        I have added two files: one is the JAR I used; the other is my Eclipse project for JTidy. I made some modifications to JTidy because it did not load the properties file correctly.
        There are some things I still need to check with this solution.
        The biggest one is this:
        With the JTidy filter enabled, the captcha image does not appear. I need to solve this problem, but I do not have enough time to debug it right now; I will try in 2 or 3 weeks.
        Any suggestions or comments are really appreciated.

        Brian Chan added a comment -

        The only thing is, jtidy hasn't been released since 2001. I'm hesitant to use something from a project that's stale. Is there an alternative?

        Angelo Immediata added a comment -

        Well... I know there is another XHTML parser (NekoHTML); I have never used it, but I will see if it can be useful. For now I am using JTidy and it seems to work pretty well.
        I'll let you know.
        Regards,
        Angelo

        Brian Chan added a comment -

        If it's stable, then that's OK... but I'd rather use a project that's still alive. Thanks.

        Angelo Immediata added a comment -

        Hi all.

        I have done some debugging on the JTidy filter and fixed a few small bugs; in fact, before these fixes we could not download files from the forum, the blogs, and so on.
        I am attaching my new JAR file and my Eclipse project to this thread.
        I used a class called "ImageRecognizer" to determine whether the bytes in the stream are an image or something else.
        Also, with JTidy the JSP files must be written more carefully. For example, when I wanted to add a role to the portal with the Tidy filter enabled, I could not see the save buttons, while without Tidy I could see them.
        This is because "editRole.jsp" has an unclosed "<select>" tag, and Tidy cannot render the whole page.
        I modified this page and now it seems to work.
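
        The source of "ImageRecognizer" is only in the attachment, so the following is just a sketch of one common way to tell whether a byte stream is an image: comparing the first bytes with well-known file signatures. The class name ImageSniffer and its methods are hypothetical and are not the attached class.

```java
// Hypothetical sketch; the attached ImageRecognizer class may work differently.
final class ImageSniffer {
    // Returns true if data begins with the given byte values (0-255).
    static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }

    // Checks the file signatures of PNG, JPEG, and GIF.
    static boolean looksLikeImage(byte[] data) {
        return startsWith(data, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)
            || startsWith(data, 0xFF, 0xD8, 0xFF)
            || startsWith(data, 'G', 'I', 'F', '8');
    }

    public static void main(String[] args) {
        byte[] gifHeader = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println(looksLikeImage(gifHeader));
    }
}
```

        A filter could use a check like this to pass binary responses (such as the captcha image) through untouched and only run Tidy on markup.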

        The file "jtidyallinone-r8-SNAPSHOT.jar" contains the file "scfConfiguration.properties", which holds the JTidy configuration.
        I will try to explain these configuration properties:

        trim-empty-elements=false tells JTidy not to trim empty elements (for example, some empty elements carry style classes that we need)
        doctype=strict tells JTidy to use the strict doctype
        wrap=10000 tells JTidy to wrap lines at around 10000 characters
        logValidationMessages=false tells JTidy not to log the validation messages
        tidy-mark=false tells JTidy not to add the Tidy mark to the response
        output-xhtml=true tells JTidy to generate XHTML output (this is important because with this option JTidy puts a CDATA section in the script tag)
        input-encoding=UTF-8 tells JTidy that the input characters are encoded in UTF-8 (without this property there are serious problems with Italian special characters)
        show-errors=1000 tells JTidy to show the first 1000 errors; it is not an important property
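
        For reference, the settings listed above can be written as an ordinary Java properties file. The sketch below holds them as text and reads them back with java.util.Properties. It only shows the file format; how the modified filter actually loads the file is in the attachment, and the class name TidyConfigExample is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical example class; it only demonstrates the properties format.
final class TidyConfigExample {
    // The eight settings described above, one key=value pair per line.
    static final String CONFIG = String.join("\n",
        "trim-empty-elements=false",
        "doctype=strict",
        "wrap=10000",
        "logValidationMessages=false",
        "tidy-mark=false",
        "output-xhtml=true",
        "input-encoding=UTF-8",
        "show-errors=1000");

    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(CONFIG);
        System.out.println("output-xhtml = " + props.getProperty("output-xhtml"));
    }
}
```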

        I am also attaching the web.xml from my ROOT.war.

        This is the JTidy filter configuration in my web.xml:

        <filter>
          <filter-name>JTidyFilter</filter-name>
          <description>Filter that makes the pages valid XHTML</description>
          <filter-class>org.w3c.tidy.servlet.filter.JTidyFilter</filter-class>
          <init-param>
            <param-name>properties.filename</param-name>
            <param-value>scfconfiguration.properties</param-value>
          </init-param>
          <init-param>
            <param-name>isCommittedFix</param-name>
            <param-value>true</param-value>
          </init-param>
          <init-param>
            <param-name>defferedStreamClose</param-name>
            <param-value>true</param-value>
          </init-param>
          <init-param>
            <param-name>tee</param-name>
            <param-value>false</param-value>
          </init-param>
          <init-param>
            <param-name>config</param-name>
            <param-value>indent: auto; indent-spaces: 0</param-value>
          </init-param>
        </filter>

        The init parameters are used as follows:

        properties.filename the name of the properties file that holds the configuration
        isCommittedFix set to true to avoid some errors when Struts performs a forward
        defferedStreamClose set to true so that files can be downloaded
        tee set to false because when it is true the original response is not modified
        config (indent: auto; indent-spaces: 0) tells JTidy to auto-indent the result with an indentation of 0 spaces (this avoids indentation characters appearing inside textarea elements)

        I hope I have been clear... my English is not the best.
        I will keep debugging, and as a next step I would like to create a "wrapper" that loads the JTidy configuration from portal.properties or portal-ext.properties.

        I have put all the other attachments into a single zip file.
        Regards,
        Angelo.

        Angelo Immediata added a comment -

        Hi all.

        I have modified the source code of the PortalUtil class and of the TidyFilter class so that the JTidy properties are taken from the portal.properties file and the Tidy filter can be enabled or disabled from the system.properties file.

        I attach here the two classes (the Tidy filter class is the only class I modified compared with the older version I posted on Jira) and the two properties files.

        I have done some tests and it seems to work. When I have more time I will test it further.
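
        The modified classes are only in the attachment, so here is a rough sketch of the idea: an on/off switch read from one set of properties, and the Tidy options picked out of another by a common prefix. The key name "jtidy.filter.enabled", the prefix "jtidy.", and the class name TidyFilterSwitch are assumptions made up for this example; the real keys in the attached files may be different.

```java
import java.util.Properties;

// Hypothetical sketch; key names and prefix are assumptions, not the attached code.
final class TidyFilterSwitch {
    static final String ENABLED_KEY = "jtidy.filter.enabled"; // assumed key name
    static final String OPTION_PREFIX = "jtidy.";             // assumed prefix

    // The filter is treated as disabled unless the switch is explicitly "true".
    static boolean isEnabled(Properties systemProps) {
        return Boolean.parseBoolean(systemProps.getProperty(ENABLED_KEY, "false"));
    }

    // Copies every "jtidy.*" entry, with the prefix removed, into a new Properties object.
    static Properties tidyOptions(Properties portalProps) {
        Properties options = new Properties();
        for (String name : portalProps.stringPropertyNames()) {
            if (name.startsWith(OPTION_PREFIX)) {
                options.setProperty(name.substring(OPTION_PREFIX.length()),
                                    portalProps.getProperty(name));
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Properties portal = new Properties();
        portal.setProperty("jtidy.output-xhtml", "true");
        System.out.println(tidyOptions(portal));
    }
}
```

        With a switch like this, the filter can simply pass the request down the chain unchanged when it is disabled, so no change to web.xml is needed to turn it off.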

        I want to share with you another point: polls.

        I have seen that the administrator can always modify a poll, even when it already has votes. In my opinion it would be more correct if the administrator could not modify a poll that has votes. For now I have modified my portal to check whether there is at least one vote (last_vote_date != null); if the last_vote_date column is not null, the administrator cannot modify the poll in my portal.

        I think a better solution is to let the administrator choose when the poll is "live". While the poll is live the administrator cannot modify it; to modify it, he or she must first take the poll offline. The poll can be taken offline if and only if no vote is associated with it.
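
        The rules of this proposal can be sketched as a small state model. This is only an illustration of the idea, not Liferay's polls code; the class PollSketch and its methods are hypothetical.

```java
// Hypothetical model of the "live poll" proposal; not Liferay's polls API.
final class PollSketch {
    private String question;
    private boolean live;
    private int voteCount;

    PollSketch(String question) {
        this.question = question;
    }

    void goLive() {
        live = true;
    }

    // Votes are only accepted while the poll is live.
    boolean vote() {
        if (!live) {
            return false;
        }
        voteCount++;
        return true;
    }

    // A poll can be taken offline only if no vote is associated with it.
    boolean takeOffline() {
        if (voteCount > 0) {
            return false;
        }
        live = false;
        return true;
    }

    // Editing is refused while the poll is live.
    boolean edit(String newQuestion) {
        if (live) {
            return false;
        }
        question = newQuestion;
        return true;
    }

    String question() {
        return question;
    }

    boolean isLive() {
        return live;
    }
}
```

        Under these rules a poll that has received a vote can never be edited again, because it can no longer be taken offline.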

        I hope I have explained my idea clearly :)

        My English is not too good.

        All the best.

        Regards,

        Angelo.

        Giulio Ferrara added a comment -

        Hi,
        I am using your great filter to generate XHTML Strict, but I found a possible bug with Liferay 4.4.0:
        the parser has problems inside a CDATA block. In particular, it treats a '<' in a logical expression as the first character of an opening XML tag, so it produces errors in the parsed JavaScript code.
        For example, in the Journal portlet, if you try to add a structure you can see that the "add row" button does not work. Below I copy the badly parsed JavaScript code that causes this error:

        <script type="text/javascript">
        //<![CDATA[
        var xmlIndent = " ";
        function _15_getXsd(cmd, elCount) {
        if (cmd == null)

        { cmd = "add"; }

        var xsd = "<root>\n";
        if ((cmd == "add") && (elCount == -1))

        { xsd += "<dynamic-element name='' type=''></dynamic-element>\n" }

        for (i = 0; i >= 0; i++) {
        var elDepth = document.getElementById("_15_structure_el" + i + "_depth");
        var elName = document.getElementById("_15_structure_el" + i + "_name");
        var elType = document.getElementById("_15_structure_el" + i + "_type");
        if ((elDepth != null) && (elName != null) && (elType != null)) {
        var elDepthValue = elDepth.value;
        var elNameValue = elName.value;
        var elTypeValue = elType.value;
        if ((cmd == "add") || ((cmd == "remove") && (elCount != i))) {
        for (var j = 0; j <=elDepthValue;j++)

        {xsd+=xmlIndent;}xsd+="<dynamic-element name='"elNameValue"' type='"elTypeValue"'>";if((cmd=="add")&&(elCount==i)){xsd+="<dynamic-element name='' type=''></dynamic-element>\n";}varnextElDepth=document.getElementById("_15_structure_el"(i+1)"_depth");if(nextElDepth!=null){nextElDepthValue=nextElDepth.value;if(elDepthValue==nextElDepthValue){for(varj=0;j<elDepthValue;j++){xsd+=xmlIndent;}

        xsd+="</dynamic-element>\n";}elseif(elDepthValue>nextElDepthValue){vardepthDiff=elDepthValue-nextElDepthValue;for(varj=0;j<=depthDiff;j+){if(j!=0){for(vark=0;k<=depthDiff-j;k){xsd=xmlIndent;}}xsd+="</dynamic-element>\n";}}else{xsd+="\n";}}else{for(varj=0;j<=elDepthValue;j+){if(j!=0){for(vark=0;k<=elDepthValue-j;k){xsd=xmlIndent;}}xsd+="</dynamic-element>\n";}}}elseif((cmd=="remove")&&(elCount==i)){varnextElDepth=document.getElementById("_15_structure_el"(i+1)"_depth");if(nextElDepth!=null){nextElDepthValue=nextElDepth.value;if(elDepthValue>nextElDepthValue){vardepthDiff=elDepthValue-nextElDepthValue;for(varj=0;j<depthDiff;j++)

        {xsd+="</dynamic-element>\n";}

        }}else{for(varj=0;j<elDepthValue;j+){xsd="</dynamic-element>\n";}}}}else{break;}}xsd+="</root>";returnxsd;}function_15_editElement(cmd,elCount)

        {document._15_fm.scroll.value="_15_xsd";document._15_fm._15_xsd.value=_15_getXsd(cmd,elCount);submitForm(document._15_fm);}

        function_15_moveElement(moveUp,elCount)

        {document._15_fm.scroll.value="_15_xsd";document._15_fm._15_move_up.value=moveUp;document._15_fm._15_move_depth.value=elCount;document._15_fm._15_xsd.value=_15_getXsd();submitForm(document._15_fm);}

        function_15_saveStructure(addAnother)

        {document._15_fm._15_cmd.value="add";document._15_fm._15_structureId.value=document._15_fm._15_newStructureId.value;document._15_fm._15_xsd.value=_15_getXsd();submitForm(document._15_fm);}

        //]]>

        </script>

        As you can see, after the '<' in the expression:

        for (var j = 0; j <=elDepthValue;j++)

        the parser stops working correctly.

        Ciao from Italy.

        Giulio Ferrara

        Giulio Ferrara added a comment -

        Hi all again,
        I just want to share a possible solution for the bug above.

        In the getCDATA method of the Lexer class, you have to add this condition:

        else if (begtag && ((c == ';') || (c == '(') || (c == ')') || (c == '&') || (c == '|'))) // Cancel start tag
        {
            start = -1;
            endtag = false;
            begtag = false;
        }

        Let me try to explain:
        with this check, every time the parser reads a '<' it looks at the following characters to determine whether this '<' is a less-than operator or the start of an XML tag.
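
        The heuristic can be shown in isolation with a much simplified sketch. It ignores quoted strings and the other branches of the real lexer loop, and the class LessThanHeuristic is hypothetical; it only mirrors the added condition: if one of the characters ; ( ) & | appears after a '<' and before the next '>', the '<' is no longer treated as the start of a tag.

```java
// Simplified, hypothetical illustration of the added condition; not the JTidy lexer.
final class LessThanHeuristic {
    // Characters that cancel a pending tag start, as in the added condition.
    private static final String CANCEL_CHARS = ";()&|";

    // Returns true if the '<' at position pos would still be treated as a tag start.
    static boolean treatedAsTag(String script, int pos) {
        if (script.charAt(pos) != '<') {
            throw new IllegalArgumentException("no '<' at position " + pos);
        }
        for (int i = pos + 1; i < script.length(); i++) {
            char c = script.charAt(i);
            if (CANCEL_CHARS.indexOf(c) >= 0) {
                return false; // looks like an expression, cancel the start tag
            }
            if (c == '>') {
                return true; // reached '>' with the tag start still pending
            }
        }
        return false; // no closing '>' found
    }

    public static void main(String[] args) {
        String loop = "for (var j = 0; j <=elDepthValue;j++)";
        System.out.println(treatedAsTag(loop, loop.indexOf('<')));
    }
}
```

        An expression such as "a < b > c" has no cancel character between '<' and '>', so a heuristic of this kind would still take it for a tag; it reduces the problem rather than removing it.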

        The full method should then look like this:

        public Node getCDATA(Node container)
        {
            int c, lastc, start, len, i;
            int qt = 0;
            int esc = 0;
            String str;
            boolean endtag = false;
            boolean begtag = false;

            if (container.isJavaScript())
            {
                esc = '\\';
            }

            this.lines = this.in.getCurline();
            this.columns = this.in.getCurcol();
            this.waswhite = false;
            this.txtstart = this.lexsize;
            this.txtend = this.lexsize;

            lastc = '\0';
            start = -1;

            while ((c = this.in.readChar()) != StreamIn.END_OF_STREAM)
            {
                // treat \r\n as \n and \r as \n
                if (qt > 0)
                {
                    // #598860 script parsing fails with quote chars
                    // A quoted string is ended by the quotation character, or end of line
                    if ((c == '\r' || c == '\n' || c == qt) && (!TidyUtils.toBoolean(esc) || lastc != esc))
                    {
                        qt = 0;
                    }
                    else if (c == '/' && lastc == '<')
                    {
                        start = this.lexsize + 1; // to first letter
                    }
                    else if (c == '>' && start >= 0)
                    {
                        len = this.lexsize - start;

                        this.lines = this.in.getCurline();
                        this.columns = this.in.getCurcol() - 3;

                        report.warning(this, null, null, Report.BAD_CDATA_CONTENT);

                        // if javascript insert backslash before /
                        if (TidyUtils.toBoolean(esc))
                        {
                            for (i = this.lexsize; i > start - 1; --i)
                            {
                                this.lexbuf[i] = this.lexbuf[i - 1];
                            }

                            this.lexbuf[start - 1] = (byte) esc;
                            this.lexsize++;
                        }

                        start = -1;
                    }
                }
                else if (TidyUtils.isQuote(c) && (!TidyUtils.toBoolean(esc) || lastc != esc))
                {
                    qt = c;
                }
                else if (c == '<')
                {
                    start = this.lexsize + 1; // to first letter
                    endtag = false;
                    begtag = true;
                }
                /*
                 * Giulio Ferrara
                 * Fix for parsing CDATA block with logical expressions.
                 * 20/03/2008
                 */
                else if (begtag && ((c == ';') || (c == '(') ||
                    (c == ')') || (c == '&') || (c == '|'))) // Cancel start tag
                {
                    start = -1;
                    endtag = false;
                    begtag = false;
                }
                else if (c == '!' && lastc == '<') // Cancel start tag
                {
                    start = -1;
                    endtag = false;
                    begtag = false;
                }
                else if (c == '/' && lastc == '<')
                {
                    start = this.lexsize + 1; // to first letter
                    endtag = true;
                    begtag = false;
                }
                else if (c == '>' && start >= 0) // End of begin or end tag
                {
                    int decr = 2;

                    if (endtag && ((len = this.lexsize - start) == container.element.length()))
                    {
                        str = TidyUtils.getString(this.lexbuf, start, len);
                        if (container.element.equalsIgnoreCase(str))
                        {
                            this.txtend = start - decr;
                            this.lexsize = start - decr; // #433857 - fix by Huajun Zeng 26 Apr 01
                            break;
                        }
                    }

                    // Unquoted markup will end SCRIPT or STYLE elements

                    this.lines = this.in.getCurline();
                    this.columns = this.in.getCurcol() - 3;

                    report.warning(this, null, null, Report.BAD_CDATA_CONTENT);
                    if (begtag)
                    {
                        decr = 1;
                    }

                    this.txtend = start - decr;
                    this.lexsize = start - decr;
                    break;
                }
                // #427844 - fix by Markus Hoenicka 21 Oct 00
                else if (c == '\r')
                {
                    if (begtag || endtag)
                    {
                        continue; // discard whitespace in endtag
                    }

                    c = this.in.readChar();

                    if (c != '\n')
                    {
                        this.in.ungetChar(c);
                    }

                    c = '\n';
                }
                else if ((c == '\n' || c == '\t' || c == ' ') && (begtag || endtag))
                {
                    continue; // discard whitespace in endtag
                }

                addCharToLexer(c);
                this.txtend = this.lexsize;
                lastc = c;
            }

            if (c == StreamIn.END_OF_STREAM)
            {
                report.warning(this, container, null, Report.MISSING_ENDTAG_FOR);
            }

            if (this.txtend > this.txtstart)
            {
                this.token = newNode(Node.TEXT_NODE, this.lexbuf, this.txtstart, this.txtend);
                return this.token;
            }

            return null;
        }

        This fix works for me.
        I hope that it will help you too.

        Ciao


          People

          • Votes: 1
          • Watchers: 4
