[jira] Created: (LANG-517) Define standard for escape/unescape HTML

JIRA jira@apache.org

[jira] Created: (LANG-517) Define standard for escape/unescape HTML

Define standard for escape/unescape HTML
----------------------------------------

                 Key: LANG-517
                 URL: https://issues.apache.org/jira/browse/LANG-517
             Project: Commons Lang
          Issue Type: Sub-task
            Reporter: Henri Yandell




--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

JIRA jira@apache.org

[jira] Updated: (LANG-517) Define standard for escape/unescape HTML


     [ https://issues.apache.org/jira/browse/LANG-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell updated LANG-517:
-------------------------------

    Fix Version/s: 3.0

> Define standard for escape/unescape HTML
> ----------------------------------------
>
>                 Key: LANG-517
>                 URL: https://issues.apache.org/jira/browse/LANG-517
>             Project: Commons Lang
>          Issue Type: Sub-task
>            Reporter: Henri Yandell
>             Fix For: 3.0
>
>



JIRA jira@apache.org

[jira] Commented: (LANG-517) Define standard for escape/unescape HTML


    [ https://issues.apache.org/jira/browse/LANG-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733032#action_12733032 ]

Henri Yandell commented on LANG-517:
------------------------------------

Note LANG-339: our escape/unescape is not symmetric.


JIRA jira@apache.org

[jira] Commented: (LANG-517) Define standard for escape/unescape HTML


    [ https://issues.apache.org/jira/browse/LANG-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773987#action_12773987 ]

Henri Yandell commented on LANG-517:
------------------------------------

HTML escaping:

    public static final CharSequenceTranslator ESCAPE_HTML3 =
        new AggregateTranslator(
            new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
            new LookupTranslator(EntityArrays.ISO8859_1_ESCAPE()),
            NumericEntityEscaper.above(0x7f)
        );

    public static final CharSequenceTranslator ESCAPE_HTML4 =
        new AggregateTranslator(
            new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
            new LookupTranslator(EntityArrays.ISO8859_1_ESCAPE()),
            new LookupTranslator(EntityArrays.HTML40_EXTENDED_ESCAPE()),
            NumericEntityEscaper.above(0x7f)
        );

HTML unescaping:

    public static final CharSequenceTranslator UNESCAPE_HTML3 =
        new AggregateTranslator(
            new LookupTranslator(EntityArrays.BASIC_UNESCAPE()),
            new LookupTranslator(EntityArrays.ISO8859_1_UNESCAPE()),
            new NumericEntityUnescaper()
        );

    public static final CharSequenceTranslator UNESCAPE_HTML4 =
        new AggregateTranslator(
            new LookupTranslator(EntityArrays.BASIC_UNESCAPE()),
            new LookupTranslator(EntityArrays.ISO8859_1_UNESCAPE()),
            new LookupTranslator(EntityArrays.HTML40_EXTENDED_UNESCAPE()),
            new NumericEntityUnescaper()
        );

The major question raised is why we are escaping characters above 0x7f as numeric entities. There is also a request to escape characters below 0x20.
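
To make the question concrete, here is a rough plain-Java sketch of what a numeric-entity step with a threshold of 0x7f does to text. It is not the Commons Lang implementation: the class name `NumericEscapeSketch` and the method `escapeAbove` are hypothetical, and the decimal `&#NNN;` output form is an assumption made for illustration.

```java
// Hypothetical illustration only; not the Commons Lang NumericEntityEscaper.
final class NumericEscapeSketch {

    /**
     * Replaces every code point strictly above {@code threshold} with a
     * decimal numeric entity (assumed form: &#NNN;). Everything else is
     * copied through unchanged.
     */
    static String escapeAbove(int threshold, String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (cp > threshold) {
                // append(int) writes the decimal value of the code point
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            // advance by one or two chars so surrogate pairs stay intact
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+20AC (euro sign) is above 0x7f, so it becomes &#8364;
        System.out.println(escapeAbove(0x7f, "price: 5\u20AC"));
    }
}
```

With a step like this in the chain, any non-ASCII character that the lookup tables do not cover is rewritten, even when the page encoding could carry it as-is, which is the behaviour being questioned.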


JIRA jira@apache.org

[jira] Commented: (LANG-517) Define standard for escape/unescape HTML


    [ https://issues.apache.org/jira/browse/LANG-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777870#action_12777870 ]

Henri Yandell commented on LANG-517:
------------------------------------

I think the best approach is to do the minimum, as it's easy to add additional escapes for characters above 0x7f and below 0x20. So with respect to the above, the NumericEntityEscaper entries would be removed from both escape options. Unescape would look the same.
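
The "minimum by default, extras on request" idea can be sketched in plain Java as follows. This is an illustration under assumptions, not the Commons Lang code: `MinimalEscapeSketch`, `MINIMAL`, `NUMERIC_ABOVE_7F` and `escapeWithNumeric` are hypothetical names, the four-entry table is only a stand-in for a basic escape table, and running the two steps one after the other is a simplification rather than a reproduction of how AggregateTranslator combines translators.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; names and table contents are assumptions.
final class MinimalEscapeSketch {

    // Stand-in for a basic escape table (assumed entries).
    private static final Map<Character, String> BASIC = new LinkedHashMap<>();
    static {
        BASIC.put('&', "&amp;");
        BASIC.put('<', "&lt;");
        BASIC.put('>', "&gt;");
        BASIC.put('"', "&quot;");
    }

    // Default behaviour: table lookup only, no numeric entities.
    static final UnaryOperator<String> MINIMAL = s -> {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            String entity = BASIC.get(c);
            if (entity != null) {
                out.append(entity);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    };

    // Optional extra step a caller can add: numeric entities above 0x7f.
    static final UnaryOperator<String> NUMERIC_ABOVE_7F = s -> {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7f) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    };

    // The table step runs first so the '&' of a numeric entity is not re-escaped.
    static String escapeWithNumeric(String s) {
        return NUMERIC_ABOVE_7F.apply(MINIMAL.apply(s));
    }

    public static void main(String[] args) {
        System.out.println(MINIMAL.apply("a<b \u00E9"));   // a&lt;b é
        System.out.println(escapeWithNumeric("a<b \u00E9")); // a&lt;b &#233;
    }
}
```

The default leaves non-ASCII characters alone, and a caller who wants them rewritten adds the numeric step explicitly. Going the other way, removing a step that is baked into the default, would be harder for callers.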


JIRA jira@apache.org

[jira] Closed: (LANG-517) Define standard for escape/unescape HTML


     [ https://issues.apache.org/jira/browse/LANG-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell closed LANG-517.
------------------------------

    Resolution: Fixed

I've changed it so that characters above 0x7f are not escaped to numeric entities.
