Re: How to trim a String trailing spaces, but not leading spaces?



In article <gacdhi$u9$1@xxxxxxxxxxxxxxxxxxxxxxxxx>,
Mark Space <markspace@xxxxxxxxxxxxx> wrote:

John B. Matthews wrote:


As did I. Sadly, they tend to be slow. Curious, I compared the regex to
Lew's loop proposal. The latter is an order of magnitude faster. The two

Micro benchmarks can be a dangerous thing. In your code it's pretty
likely that the regex (which is passed as a string parameter) has to be
compiled each time it runs. I'd be curious what the run time is if the
pattern is precompiled once and then re-used. You also only search one
very short string, which may skew the results.

Excellent point. Not surprisingly, pre-compiling the regex helps, but
the benefit diminishes with increasing string length. The Loop time
approaches the others to within a factor of two, but only for
unreasonably long padding.

<sscce>
import java.util.regex.Pattern;

public class RightTrim {

    public static void main(String[] args) {
        // Padding lengths of 10, 100, 1000 and 10000 spaces.
        for (int i = 1; i < 5; i++) {
            String s = testString((int) Math.pow(10, i));
            (new RegEx()).test(s);
            (new Compiled()).test(s);
            (new Loop()).test(s);
            System.out.println();
        }
    }

    // "Test", then the given number of spaces, then five control characters.
    private static String testString(int padding) {
        String controls = "\t\n\u000B\f\r";
        StringBuilder sb = new StringBuilder("Test");
        for (int i = 0; i < padding; i++) sb.append(" ");
        sb.append(controls);
        return sb.toString();
    }
}

abstract class Test {
    public static final int COUNT = 10000;

    // Runs rTrim COUNT times and prints the elapsed time in milliseconds.
    public void test(String in) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < COUNT; i++) rTrim(in);
        System.out.println(name()
            + (System.currentTimeMillis() - start));
    }

    public abstract String rTrim(String in);
    public abstract String name();
}

/** @author JBM */
class RegEx extends Test {
    // String.replaceAll compiles the pattern on every call.
    public String rTrim(String in) {
        return in.replaceAll("\\s+$", "");
    }
    public String name() { return "RegExpr: "; }
}

/** @author JBM, MS */
class Compiled extends Test {
    // Pattern compiled once and shared by all calls.
    private static final Pattern right = Pattern.compile("\\s+$");
    public String rTrim(String in) {
        return right.matcher(in).replaceAll("");
    }
    public String name() { return "Compiled: "; }
}

/** @author Lew */
class Loop extends Test {
    // Scans backward from the end to the last non-whitespace character.
    public String rTrim(String in) {
        int len = in.length();
        while (len > 0) {
            if (!Character.isWhitespace(in.charAt(--len))) {
                return in.substring(0, len + 1);
            }
        }
        return "";
    }
    public String name() { return "Loop: "; }
}
</sscce>
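
A timing comparison like this only means something if the three versions
return the same result, which the test above does not check. Here is a
minimal sketch of such a check. RightTrimCheck and its method names are
hypothetical, not part of the SSCCE; the three trim bodies are copied from
the classes above.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the original SSCCE. */
final class RightTrimCheck {

    private static final Pattern RIGHT = Pattern.compile("\\s+$");

    // Same body as RegEx.rTrim above.
    static String regexTrim(String in) {
        return in.replaceAll("\\s+$", "");
    }

    // Same body as Compiled.rTrim above.
    static String compiledTrim(String in) {
        return RIGHT.matcher(in).replaceAll("");
    }

    // Same body as Loop.rTrim above.
    static String loopTrim(String in) {
        int len = in.length();
        while (len > 0) {
            if (!Character.isWhitespace(in.charAt(--len))) {
                return in.substring(0, len + 1);
            }
        }
        return "";
    }

    // True if all three approaches produce the same string for this input.
    static boolean allAgree(String in) {
        String expected = loopTrim(in);
        return expected.equals(regexTrim(in))
            && expected.equals(compiledTrim(in));
    }

    public static void main(String[] args) {
        String[] samples = {
            "", "   ", "Test", "  Test", "Test \t\n\u000B\f\r", "a b  "
        };
        for (String s : samples) {
            System.out.println("agree: " + allAgree(s));
        }
    }
}
```

The versions agree on the benchmark's test string, but the two notions of
whitespace are not identical: Character.isWhitespace also accepts U+001C
through U+001F, which \s does not match by default, so allAgree("Test\u001C")
comes back false.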

However, good job doing the benchmarking. It's important to test, and
even a skewed test might be better than none at all.
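
On the skew: one common way to reduce it, short of using a real benchmark
harness, is to run an untimed warm-up pass so the JIT has a chance to
compile the code, and to time with System.nanoTime. The sketch below
assumes Java 8 or later for the lambdas. WarmTimer, WARMUP, and nanosPerCall
are hypothetical names, and the iteration counts are arbitrary. It is an
illustration, not a rigorous measurement.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

/** Hypothetical timing helper; names and counts are assumptions. */
final class WarmTimer {

    static final int WARMUP = 10_000;  // untimed iterations (arbitrary)
    static final int COUNT = 10_000;   // timed iterations, as in the SSCCE

    // Written after each run so the trimmed results are actually used.
    static volatile int sink;

    // Average nanoseconds per call of f on the given input.
    static long nanosPerCall(UnaryOperator<String> f, String in) {
        int used = 0;
        for (int i = 0; i < WARMUP; i++) used += f.apply(in).length();
        long start = System.nanoTime();
        for (int i = 0; i < COUNT; i++) used += f.apply(in).length();
        long elapsed = System.nanoTime() - start;
        sink = used;
        return elapsed / COUNT;
    }

    public static void main(String[] args) {
        Pattern right = Pattern.compile("\\s+$");
        String in = "Test          \t\n";
        System.out.println("RegExpr:  "
            + nanosPerCall(s -> s.replaceAll("\\s+$", ""), in) + " ns");
        System.out.println("Compiled: "
            + nanosPerCall(s -> right.matcher(s).replaceAll(""), in) + " ns");
    }
}
```

Any function from String to String can be passed in, so the loop version
can be timed the same way by wrapping it in a lambda.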

--
John B. Matthews
trashgod at gmail dot com
home dot woh dot rr dot com slash jbmatthews