> Hi Scott,
>
> I just use something like this:
>
> s = s.replaceAll("\\s+", " ");
>
> or since you are doing unicode:
>
> String s = "This\u0200\u0200is\u0200a\u0200\u0200test";
> System.out.println("before=" + s);
> s = s.replaceAll("\u0200+", "\u0200");
> System.out.println("after=" + s);
>
> Gives me this:
> before=ThisȀȀisȀaȀȀtest
> after=ThisȀisȀaȀtest
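
A note on the two snippets above, in case it helps: without extra flags, Java's `\s` only matches the ASCII characters space, \t, \n, \x0B, \f and \r, and U+0200 is a letter (Ȁ) rather than white space, so it works here as a visible stand-in. Below is a minimal sketch of the "collapse to a single U+0020" step, assuming Java 7 or later (where the regex engine accepts `\p{IsWhite_Space}`) and assuming the spec's "Unicode white space" corresponds to the Unicode White_Space property. The class name, the method name, and the choice to return an empty string for null are illustrative assumptions of mine, not anything from commons-lang.

```java
import java.util.regex.Pattern;

public class NormalizeWhiteSpace {

    // Matches one or more characters carrying the Unicode White_Space
    // property. Assumption: this is the set the W3C rule has in mind.
    private static final Pattern WHITE_SPACE_RUN =
            Pattern.compile("\\p{IsWhite_Space}+");

    // Hypothetical helper. Returning "" for null is an assumption, chosen
    // because the rule says the algorithm always returns a string.
    public static String normalize(String text) {
        if (text == null) {
            return "";
        }
        // Each run of white space, mixed or not, becomes one U+0020 SPACE.
        return WHITE_SPACE_RUN.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        // NBSP + EM SPACE, then tab + newline, then two plain spaces.
        String s = "This\u00A0\u2003is\t\na  test";
        System.out.println("after=" + normalize(s)); // after=This is a test
    }
}
```

Unlike squeeze(), which only collapses repeats of the same character, this replaces a mixed run (say, a tab followed by a no-break space) with one space. It does not trim, so a leading or trailing run still leaves a single space, which is how I read the rule as quoted.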
>
> Of course, you lose the null checking that commons-lang gives you. Using
> CharSetUtils.squeeze() also gives me identical results...
>
> String s = "This\u0200\u0200is\u0200a\u0200\u0200test";
> System.out.println("before=" + s);
> s = org.apache.commons.lang.CharSetUtils.squeeze(s, new String[] {"\u0200"});
> System.out.println("after=" + s);
>
> Also changed your subject line to include [lang], per the guidelines on
> this list.
>
> -sujit
>
> On Thu, 2009-10-29 at 16:21 +0000, Scott Wilson wrote:
>> Hi everyone,
>>
>> I need to implement a W3C processing algorithm which states:
>>
>> 10.1.8 Rule for Getting Text Content with Normalized White Space
>> The rule for getting text content with normalized white space is given
>> in the following algorithm. The algorithm always returns a string,
>> which MAY be empty.
>>
>> • Let input be the Element to be processed.
>> • Let result be the result of applying the rule for getting text
>> content to input.
>> • In result, convert any sequence of one or more Unicode white space
>> characters into a single U+0020 SPACE.
>> • Return result.
>>
>> The step I'm having problems with is "convert any sequence of one or
>> more Unicode white space characters into a single U+0020 SPACE."
>>
>> The StringUtils replace() and CharSetUtils squeeze() methods would
>> seem to be best suited for solving this one but, for one thing, there
>> doesn't seem to be a set syntax for easily specifying the Unicode
>> white space chars.
>>
>> Has anyone else solved a similar problem using commons lang, or should
>> I consider using something else?
>>
>> Thanks!
>>
>> S
>>
>>
>> /-/-/-/-/-/
>> Scott Wilson
>> Apache Wookie: http://incubator.apache.org/projects/wookie.html