Re: Memset is faster than simple loop?



AndersWang@xxxxxxxxx wrote:
Can anybody here explain to me why memset would be faster than a
simple loop? I doubt it!

In an int array scenario:

int array[10];

for(int i=0;i<10;i++) //ten loops
array[i]=0;

or

memset(array,0,sizeof(array));

So, what will memset do inside?

There's no guarantee that memset() is faster; that's probably just an
observation a number of people have made. And if it is faster, it is
probably because the implementation uses some carefully tuned method
that exploits features of the processor the program is running on,
which the compiler may not be able to find when it compiles the loop.
But that doesn't have to be the case. It's a question of how good the
implementation of memset() is on the one hand and how good the compiler
is on the other (and the compiler could even be clever enough to
replace the loop by a single call of memset(), and there goes all your
difference in speed ;-).
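
As an illustration of what a "carefully tuned method" might look like, here is a minimal sketch of a fill routine that stores eight bytes at a time once the pointer is aligned. The name fill_wordwise is hypothetical, and this is not the code of any real memset(); actual library versions are usually written in assembler or use vector instructions, and they differ from platform to platform. The sketch also assumes the platform provides uint64_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word-at-a-time fill; NOT taken from any real C library. */
void *fill_wordwise(void *dst, int val, size_t count)
{
    unsigned char *p = dst;
    unsigned char byte = (unsigned char)val;
    /* Replicate the byte into all eight bytes of a 64-bit word. */
    uint64_t word = UINT64_C(0x0101010101010101) * byte;

    assert(dst != NULL || count == 0);

    /* Head: single-byte stores until p is aligned to the word size. */
    while (count > 0 && ((uintptr_t)p % sizeof word) != 0) {
        *p++ = byte;
        count--;
    }

    /* Body: one 8-byte store per iteration. The fixed-size memcpy()
       avoids aliasing problems; compilers typically turn it into a
       single store instruction. */
    while (count >= sizeof word) {
        memcpy(p, &word, sizeof word);
        p += sizeof word;
        count -= sizeof word;
    }

    /* Tail: the remaining 0..7 bytes, one at a time. */
    while (count > 0) {
        *p++ = byte;
        count--;
    }

    return dst;
}
```

For the 40-byte array from the question, a routine like this would need about five wide stores instead of 40 byte stores, provided the array starts on an 8-byte boundary. Whether that makes it faster in practice still depends on the compiler and the processor.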

Here is a snippet from the MS C run-time code:

void * __cdecl memset (
        void *dst,
        int val,
        size_t count
        )
{
        void *start = dst;

#if defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64)
        {
        extern void RtlFillMemory( void *, size_t count, char );

        RtlFillMemory( dst, count, (char)val );
        }
#else  /* defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64) */
        while (count--) {  // Watch here!
                *(char *)dst = (char)val;
                dst = (char *)dst + 1;
        }
#endif  /* defined (_M_MRX000) || defined (_M_ALPHA) || defined (_M_PPC) || defined (_M_IA64) */

        return(start);
}

memset initializes a block of memory byte by byte. So the while loop
will be executed 4*10 = 40 times (with 4-byte ints)!

That's just one of many implementations, and you can't deduce any
general statement from looking at a particular one. And, as you will
notice when you have a close look, memset() is implemented in a
different way depending on the architecture, so you can't even say
how memset() is implemented for the "MS C run-time", only how it is
implemented for the "MS C run-time" on a certain architecture.

I'm confused. Why do people still believe that memset is faster than a
loop in most scenarios?

It isn't a question of belief. You have to carefully measure the
behaviour for the implementation and architecture you are using.
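
A rough way to do such a measurement is sketched below. The function names (zero_with_loop, zero_with_memset, seconds_for) are made up for this example, and clock() is only a coarse timer. An optimizing compiler may also turn the loop into a memset() call or drop redundant stores, so treat the numbers as a first hint and check the generated code before drawing conclusions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef void (*fill_fn)(int *, size_t);

/* The two candidates from the question. */
void zero_with_loop(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = 0;
}

void zero_with_memset(int *a, size_t n)
{
    memset(a, 0, n * sizeof *a);
}

/* Reading the result into a volatile object keeps the compiler from
   discarding the fill as dead code. */
static volatile int sink;

/* Returns the CPU time in seconds used by 'reps' calls of fn(a, n).
   Requires n > 0. */
double seconds_for(fill_fn fn, int *a, size_t n, long reps)
{
    assert(n > 0);

    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        a[n - 1] = 1;        /* dirty the array so each call has work to do */
        fn(a, n);
        sink = a[n - 1];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling seconds_for(zero_with_loop, array, 10, reps) and seconds_for(zero_with_memset, array, 10, reps) with a large reps, at different optimization levels and for different array sizes, gives numbers that apply to that one compiler, library and machine only.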

Regards, Jens
--
\ Jens Thoms Toerring ___ jt@xxxxxxxxxxx
\__________________________ http://toerring.de