Re: how to replace a substring in a string using C?



On Tue, 08 Nov 2005 00:32:27 +0100, Skarmander wrote:
> Netocrat wrote:
>> On Mon, 07 Nov 2005 18:34:03 +0100, Skarmander wrote:
>>>Netocrat wrote:
[...]
>> [O]ne reason I decided against [an implementation of a string
>> replacement loop using strstr and strcpy] is that it iterates over the
>> string redundantly - once for strstr to find a match and once for
>> memcpy to copy the searched-over part of the string.
>
> This conveniently ignores the fact that you issue a call to memcmp() for
> every byte you're iterating over.

Yes, if you look at it like that, the first version has redundant
operations too.
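
To make the comparison concrete, here is a rough sketch of the kind of strstr/memcpy loop under discussion. It is not the code posted earlier in the thread: the name replace_strstr and its interface are made up for illustration, and it assumes that each gap between matches fits in a ptrdiff_t and that the result size doesn't overflow a size_t.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only.  Returns a malloc'd copy
 * of src with every occurrence of old replaced by new_, or NULL if old
 * is empty or allocation fails.  The caller frees the result. */
char *replace_strstr(const char *src, const char *old, const char *new_)
{
    size_t oldlen = strlen(old);
    size_t newlen = strlen(new_);
    size_t count = 0;
    const char *p, *q;
    char *ret, *out;

    if (oldlen == 0)
        return NULL;

    /* First pass: count the matches so the result can be sized. */
    for (p = src; (q = strstr(p, old)) != NULL; p = q + oldlen)
        count++;

    /* Assumption: this size computation does not overflow size_t. */
    ret = malloc(strlen(src) - count * oldlen + count * newlen + 1);
    if (ret == NULL)
        return NULL;

    out = ret;
    for (p = src; (q = strstr(p, old)) != NULL; p = q + oldlen) {
        /* Assumption: q - p is representable in a ptrdiff_t. */
        size_t skipped = (size_t)(q - p);

        memcpy(out, p, skipped);   /* second walk over the bytes strstr scanned */
        out += skipped;
        memcpy(out, new_, newlen);
        out += newlen;
    }
    strcpy(out, p);                /* copy the tail after the last match */
    return ret;
}
```

The bytes between matches are walked once by strstr and again by memcpy, which is the redundancy I was referring to.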

[...]
>> So we agree - it may or may not be faster. I'll grant that for large
>> strings it may well be faster, because the implementation can likely
>> make use of implementation-specific hardware to copy large blocks of
>> memory.
>
> For small strings it may well be faster too, because the cache then
> becomes your friend. Copying blocks is a *very* efficient procedure on
> most implementations. Compare-and-copy-if-not-equal-repeat, as expressed
> in C, typically is not.

If I get inspired I'll do some benchmarking and post results.

[...]
>>>>by 6.5.6#9 if the result of subtracting two pointers does not fit in a
>>>>ptrdiff_t type the behaviour is undefined.
>>
> Hold on, I did read it, but let's not exaggerate this. Of course the
> standard doesn't guarantee it, but there will be some meaningful range
> of values for which it holds. Otherwise there would be absolutely no
> point to pointer arithmetic ever, no?

Sure, it just means you can't rely on it for objects of arbitrary size
unless you check that PTRDIFF_MAX >= SIZE_MAX (I know I'm not telling
you anything new, just clarifying it). It's possible for static objects
to be larger than SIZE_MAX, but a program containing such objects is not
strictly conforming (it exceeds an environmental limit).
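
As a rough sketch of such a check (the helper name ptrdiff_safe_for is made up for illustration): where PTRDIFF_MAX >= SIZE_MAX the answer is known at compile time, and otherwise the object's size has to be compared against PTRDIFF_MAX at run time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.  Returns nonzero if the
 * difference of any two pointers into (or one past) an object of n
 * bytes is guaranteed to fit in a ptrdiff_t. */
int ptrdiff_safe_for(size_t n)
{
#if PTRDIFF_MAX >= SIZE_MAX
    (void)n;                        /* every possible size is safe */
    return 1;
#else
    return n <= (size_t)PTRDIFF_MAX;
#endif
}
```

On many common implementations PTRDIFF_MAX is smaller than SIZE_MAX, so it's the run-time branch that gets compiled there.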

[...]
> It's actually unfriendly of the standard library not to give us mem*()
> functions that can operate on pointer segments,

I'm not sure what you mean by a pointer segment.

> or str*() functions that return indices.

Sometimes, such as for this function, that would be useful, yes.
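
Something along these lines, say. This is only a sketch: str_index and STR_NOT_FOUND are made-up names, not anything in the standard library, and the search is the naive memcmp-at-every-offset one rather than anything clever. It works entirely in size_t and never subtracts pointers, so it doesn't depend on ptrdiff_t at all.

```c
#include <stddef.h>
#include <string.h>

#define STR_NOT_FOUND ((size_t)-1)   /* hypothetical "no match" sentinel */

/* Hypothetical index-returning counterpart to strstr, for illustration.
 * Returns the offset of the first occurrence of needle in haystack, or
 * STR_NOT_FOUND if there is none.  An empty needle matches at 0. */
size_t str_index(const char *haystack, const char *needle)
{
    size_t hlen = strlen(haystack);
    size_t nlen = strlen(needle);
    size_t i;

    if (nlen > hlen)
        return STR_NOT_FOUND;

    /* hlen - nlen cannot wrap here because nlen <= hlen. */
    for (i = 0; i <= hlen - nlen; i++)
        if (memcmp(haystack + i, needle, nlen) == 0)
            return i;

    return STR_NOT_FOUND;
}
```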

[a string replacement function variant that splits the non-const qualified
source string into segments of PTRDIFF_MAX]
> OK, now, I agree, this goes too far. This reaches the point of "not
> worth it". *If* things are such that ptrdiff_t isn't large enough, a
> custom strstr() should be used.

Unless the constraint against const-qualified strings and string literals
were a problem, I'd prefer the segmented function to a custom strstr,
since, as you accept, the custom strstr is subject to a potential
performance reduction that the segmented function isn't.

> If that slows things down too much, the byte-copying variant should be
> used. This code is too complex, and needlessly constrained on platforms
> that have no ptrdiff_t problem.

It is complex, although the final result is readable (by me at least)
and, I hope, understandable by others through its comments.

--
http://members.dodo.com.au/~netocrat