Re: Hundreds of cases in a switch, optimization?



On Sun, 19 Jun 2005 13:32:14 +0800, He Shiming wrote:

> In terms of generated machine code, how does hundreds of cases in a switch
> differ from hundreds of if-elses? Do compilers or processors do any
> optimization on such structured code? Do I need to worry about the
> performance, usually?

I got curious about this so I decided to see what one compiler did on one
architecture. I chose the simplest kind of test that would be the most
likely to optimised -- consecutive integers from 0 to N. I wrote a script
that made programs of this form:

#include <stdlib.h>

extern int f(int x, int nc);

void use_if_stmt(int n)
{
int x = 0;
while (n--) {
int v = f(n, <nc>);
if (v == 0) x++;
else if (v == 1) x++;
}
}

void use_switch_stmt(int n)
{
int x = 0;
while (n--) {
int v = f(n, <nc>);
switch (v) {
case 0: x++; break;
case 1: x++; break;
}
}
}

int main(int argc, char **argv)
{
int n = argc > 1 ? strtol(argv[1], NULL, 0) : 10000000;
use_if_stmt(n);
use_switch_stmt(n);
return 0;
}

where the number of cases (and "if" choices) could be varied (<nc> gets
replaced by this value as well). The function f could be one of "low",
"spread", or "high" and the resulting code was linked with the following:

int spread(int n, int nc) { return n % nc; }
int low (int n, int nc) { return 0; }
int high (int n, int nc) { return n + nc; }

to simulate the situation where the value to be matched was always the
first ("low"); evenly spread over the range of choices ("spread"); or
higher than the highest case ("high"). The resulting programs where
compiled both with (gcc -O2) and without optimisation. gprof was used to
compare the running times of the two functions "use_if_stmt" and
"use_switch_stmt". I simply calculated the ration of the "if" time
to the "switch" time, so a result of 1 means that they both took
the same time. A value of 5 means the if version took five times
as long as the switch version.

Without optimisation the results were:

N cases all low spread all high
2 0.8 1.1 0.8
20 0.8 1.7 5.5
200 0.9 8.1 66.4
2000 0.9 112.6 3042.0

almost exactly as I would have guessed. When the value being matched is
the first case, "if" is always faster. With 200 cases and all of the
value off the top of the case range, "if" takes 66 times longer.

What surprised me was the optimised numbers:

N cases all low spread all high
2 1.1 0.9 1.0
20 1.3 0.7 0.9
200 1.2 0.9 1.1
2000 0.8 1.1 1.0

I think that is quite impressive!

gcc (GCC) 3.4.3 on Pentium(R) M processor 1.70GHz

--
Ben.

.



Relevant Pages

  • Re: I have a question
    ... int j = i; ... It is also considered a good thing to have code that is compilable with a C89 compiler, since very few C compilers are actually C99 compilers. ... That means declaring variables at the top of a block and not after executable statements. ...
    (comp.lang.c)
  • NO_EXT_KEYS and strict ISO/ANSI modes
    ... int foo ... As far as I understand though this macro has been there ... do recent MS C compilers or other common compilers still ... -za99 switch, well C99 support is far from complete but a switch similar ...
    (comp.lang.c)
  • [WATCHDOG] v2.6.27-rc patches
    ... [WATCHDOG] ... Re-structure the ioctl switch call for all drivers as follows: ... module_param(wdt_start, int, 0); ... +static long acq_ioctl(struct file *file, unsigned int cmd, unsigned long arg) ...
    (Linux-Kernel)
  • Re: a little mistake
    ... Also the use of "inline" for that function is redundant. ... compilers will do it anyways and it isn't portable across all compilers ... int main ... Calling function would be part of the users code. ...
    (comp.lang.c)
  • Re: It works without stdio.h !!
    ... most C compilers will assume that a function that has no prototype is an ... extern int, and since you only specify functions that live in the C library, an unresolved ... never declare a variable of type char unless you well and truly believe you need to be ... strcpy, strcat, sprintf etc. with it. ...
    (microsoft.public.vc.mfc)