Re: trim whitespace



John Kelly wrote:
On Sun, 22 Aug 2010 04:32:31 -0400, Shao Miller <sha0.miller@xxxxxxxxx>
wrote:

You do two C operations for every character. A read, and a store.
Right. Well, a read for every character, and a store for every character after the left trim, plus one store for the terminator.

Are you saying this code


/* Trim left */
while (isspace(*string = *i))
    ++i;

does not do a read and store on every iteration?
Oops. Missed that change in the paste. My claim should be accurate for:

/* Trim left */
while (isspace(*i))
    ++i;

Thanks, Mr. J. Kelly. Sorry about that.
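
To make the read/store counts concrete, here is a minimal sketch of a single-pass trim built around the corrected loop. It is not the code originally posted. The name 'trim_in_place' and the variables 'out' and 'end' are hypothetical, and the 'unsigned char' cast on the 'isspace' argument is my own addition. The left trim only reads, each remaining character is read once and stored once, and there is one final store for the terminator.

```c
#include <ctype.h>

/* Hypothetical helper, for illustration only: trims leading and
 * trailing whitespace from 'string' in place. */
static void trim_in_place(char *string)
{
    char *i = string;    /* read position */
    char *out = string;  /* write position */
    char *end = string;  /* one past the last non-space character stored */

    /* Trim left: one read per character, no stores */
    while (isspace((unsigned char)*i))
        ++i;

    /* Copy the rest: one read and one store per character */
    while (*i) {
        char c = *i;

        *out = c;
        ++out;
        ++i;
        if (!isspace((unsigned char)c))
            end = out;
    }

    /* Trim right: a single store for the terminator */
    *end = '\0';
}
```

With this sketch, a buffer holding "  ab c  " ends up holding "ab c", and a buffer holding only whitespace ends up empty.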

I only read each character. The memmove() may be library optimized with
assembly language.
And if it isn't?

Then mine is no worse than yours.
Why not? Would you not have two read passes and one write pass?

Suppose 'memmove' "remaps" the pointer to point directly to the target. Well, that'd certainly be a nearly instantaneous "write pass".

Maybe 'strchr' could quickly give you a pointer to the end of the string and you can 'isspace' your way backwards to find the right-hand trim boundary.
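
A rough sketch of that idea, assuming 'strchr(string, '\0')' is used to locate the terminator (the Standard permits searching for the terminating null character). The function name 'right_trim_boundary' is hypothetical. Whether this beats a hand-written scan depends entirely on the 'strchr' implementation.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns a pointer one past the last
 * non-whitespace character of 'string', or 'string' itself if the
 * string is empty or all whitespace. */
static char *right_trim_boundary(char *string)
{
    /* Let the library find the terminator */
    char *end = strchr(string, '\0');

    /* Walk backwards over trailing whitespace */
    while (end > string && isspace((unsigned char)end[-1]))
        --end;

    return end;
}
```

Storing '\0' through the returned pointer would then perform the right trim.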

The implementations of 'memmove' and 'strchr' are unknowns. How can you guarantee that your own read pass followed by a 'memmove' will be no slower than your own read pass and your own write pass?

Seems intuitive to me.
Ok. Well perhaps you are right. :)

Function calls to C Standard Library functions can have overhead. Checking overlap conditions in a 'memmove' implementation can have overhead. Using a buffer in a 'memmove' implementation can have overhead. Calculating the number of times to copy X bytes-at-a-time followed by the remaining bytes can have overhead. I'm not suggesting that these concerns are valid for your target implementation(s), but they are unknowns, aren't they?

Maybe we should race.
Ok. Gathering statistics seems like a fair suggestion. But really I offered the code because it doesn't worry about 'ptrdiff_t' or 'size_t' and you seemed concerned about that.

If you expect that 'memmove' calls will significantly outperform this code, it's still possible to avoid 'ptrdiff_t'. During your search for the terminator, you could increment a 'size_t' count up to 'SIZE_MAX' and perform a 'memmove' whenever you reach either the terminator or a count of 'SIZE_MAX'.
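
Something along these lines is what I have in mind. It is only a sketch, not tested against any particular implementation. The names 'trim_chunked' and 'trim_memmove' are hypothetical, and the 'limit' parameter exists only so that the "count reached" path can be exercised with small values. It must be greater than zero. Only pointers and a 'size_t' count are used, never a pointer difference.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: trims 'string' in place, moving at most
 * 'limit' bytes per 'memmove' call. 'limit' must be non-zero. */
static void trim_chunked(char *string, size_t limit)
{
    char *src = string;   /* start of the next chunk to move */
    char *dst = string;   /* where that chunk will land */
    char *trim = string;  /* one past the last non-space char, at its destination */

    /* Trim left */
    while (isspace((unsigned char)*src))
        ++src;

    while (*src) {
        size_t count = 0;

        /* Scan until the terminator or until the count hits the limit */
        while (count < limit && src[count]) {
            if (!isspace((unsigned char)src[count]))
                trim = dst + count + 1;
            ++count;
        }

        /* Move this chunk; 'dst' never passes 'src', so unread data is safe */
        memmove(dst, src, count);
        src += count;
        dst += count;
    }

    /* Trim right */
    *trim = '\0';
}

/* The variant described above: count all the way up to SIZE_MAX */
static void trim_memmove(char *string)
{
    trim_chunked(string, SIZE_MAX);
}
```

In practice a single 'memmove' will handle any ordinary string. The outer loop only matters for the 'SIZE_MAX' count reached condition.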