Re: Efficient strcat implementation.



On Apr 22, 11:34 am, Ben Pfaff <b...@xxxxxxxxxxxxxxx> wrote:
Mahesh<mmp....@xxxxxxxxx> writes:
I am looking forefficientstringcancatination c code. I did it using
ptr [using default method] but my code i think is notefficient.

One possible implementation:
strcpy(strchr(dst, '\0'), src);

But here is what the GNU libc manual says about strcat:

----------------------------------------------------------------------

Programmers using the `strcat' or `wcscat' function (or the
following `strncat' or `wcsncar' functions for that matter) can easily
be recognized as lazy and reckless. In almost all situations the
lengths of the participating strings are known (it better should be
since how can one otherwise ensure the allocated size of the buffer is
sufficient?) Or at least, one could know them if one keeps track of the
results of the various function calls. But then it is very inefficient
to use `strcat'/`wcscat'. A lot of time is wasted finding the end of
the destinationstringso that the actual copying can start. This is a
common example:

/* This function concatenates arbitrarily many strings. The last
parameter must be `NULL'. */
char *
concat (const char *str, ...)
{
va_list ap, ap2;
size_t total = 1;
const char *s;
char *result;

va_start (ap, str);
/* Actually `va_copy', but this is the name more gcc versions
understand. */
__va_copy (ap2, ap);

/* Determine how much space we need. */
for (s = str; s != NULL; s = va_arg (ap, const char *))
total += strlen (s);

va_end (ap);

result = (char *) malloc (total);
if (result != NULL)
{
result[0] = '\0';

/* Copy the strings. */
for (s = str; s != NULL; s = va_arg (ap2, const char *))
strcat (result, s);
}

va_end (ap2);

return result;
}

This looks quite simple, especially the second loop where the strings
are actually copied. But these innocent lines hide a major performance
penalty. Just imagine that ten strings of 100 bytes each have to be
concatenated. For the secondstringwe search the already stored 100
bytes for the end of thestringso that we can append the nextstring.
For all strings in total the comparisons necessary to find the end of
the intermediate results sums up to 5500! If we combine the copying
with the search for the allocation we can write this function moreefficient:

char *
concat (const char *str, ...)
{
va_list ap;
size_t allocated = 100;
char *result = (char *) malloc (allocated);

if (result != NULL)
{
char *newp;
char *wp;

va_start (ap, str);

wp = result;
for (s = str; s != NULL; s = va_arg (ap, const char *))
{
size_t len = strlen (s);

/* Resize the allocated memory if necessary. */
if (wp + len + 1 > result + allocated)
{
allocated = (allocated + len) * 2;
newp = (char *) realloc (result, allocated);
if (newp == NULL)
{
free (result);
return NULL;
}
wp = newp + (wp - result);
result = newp;
}

wp = mempcpy (wp, s, len);
}

/* Terminate the resultstring. */
*wp++ = '\0';

/* Resize memory to the optimal size. */
newp = realloc (result, wp - result);
if (newp != NULL)
result = newp;

va_end (ap);
}

return result;
}

With a bit more knowledge about the input strings one could fine-tune
the memory allocation. The difference we are pointing to here is that
we don't use `strcat' anymore. We always keep track of the length of
the current intermediate result so we can safe us the search for the
end of thestringand use `mempcpy'. Please note that we also don't
use `stpcpy' which might seem more natural since we handle with
strings. But this is not necessary since we already know the length of
thestringand therefore can use the faster memory copying function.
The example would work for wide characters the same way.

Whenever a programmer feels the need to use `strcat' she or he
should think twice and look through the program whether the code cannot
be rewritten to take advantage of already calculated results. Again: it
is almost always unnecessary to use `strcat'.

--
"Some people *are* arrogant, and others read the FAQ."
--Chris Dollin

Thanks a lot Chris and all who assisted me in this regard. Much Glad
about response.
.



Relevant Pages

  • Re: Efficient strcat implementation.
    ... Programmers using the `strcat' or `wcscat' function (or the ... /* This function concatenates arbitrarily many strings. ... concat (const char *str, ...) ... if (newp == NULL) ...
    (comp.lang.c)
  • Re: Efficient strcat implementation.
    ... Programmers using the `strcat' or `wcscat' function (or the ... /* This function concatenates arbitrarily many strings. ... concat (const char *str, ...) ... if (newp == NULL) ...
    (comp.lang.c)
  • Re: palindrome iteration
    ... empty strings, for input strings that contain many words with mixed ... So we use iterative loops and single char tests. ... To avoid bugs like ones in your code you may think about the loop ... 'int' object has no attribute 'lower' ...
    (comp.lang.python)
  • Re: Sets and portability (was) Re: Is ISO Pascal compatible with J&W (original) Pascal ?
    ... strings, the user can control the length by the data they process; ... >> The computer world is more complex than it's ever been (eg Unicode) ... The Pascal `Char' type can be this size (unlike C, ... > Note that ansi->wide conversion is codepage sensitive. ...
    (comp.lang.pascal.misc)
  • Re: socket communication: send & receive doesnt work right
    ... But can I actually send and receive strings? ... I did a test of sending two doubles: ... public void send_doubles(double vals, int len) throws ... char *result; ...
    (microsoft.public.win32.programmer.networks)