Re: COMPARE HLL/ASM
- From: "Wolfgang Kern" <nowhere@xxxxxxxx>
- Date: Wed, 26 Dec 2007 03:58:23 +0100
santosh wrote:
....
If you don't mind, I specify in more detail:
* string is zero-padded fix-sized, no leading spaces ,no end-mark and
UPPERCASE ie: "FEDCBA98"
* no error indication if any non hex-characters are found.
* just for compatibilty with oldies: no SSE-instructions allowed.
Okay. But for my C code the above conditions are already too late
because I finished coding it before I saw this post. I accept variable
length strings upto ffffffffffffffff with optional leading zero
padding. Both upper and lower case are accepted.
Invalid or malformed strings are rejected, but not by the actual
conversion routine, but when program starts up. This takes place before
the conversion loop is entered, so the time spent is irrelevant.
OK. I'll double my code to make it 128->64 bit and add the U/L-case yet.
And we just measure conversion time.
I used no optimisations for compiling.
* there are no C-functions available on most ASM tools,
so RDTSC seems to be the only choice.
Okay. I used RDTSC as well.
* running it several million times will just measure back-ground
noise from the OS, and STI/CLI may not work on every OS.
I'd recommend to run it three to four times and report the smallest
time interval.
Okay.
So here is my effort. The invocation and output are shown for 3 runs of
1 million loops each.
$ ./hstoq_icc aabbccddeeff0011
input: aabbccddeeff0011 passes: 1000000
rdtsc:
start = 40949977738136
end = 40950155398528
difference = 177660392
I see: ~177 per pass
$ ./hstoq_icc aabbccddeeff0011
input: aabbccddeeff0011 passes: 1000000
rdtsc:
start = 40955083892880
end = 40955285678376
difference = 201785496
now: ~201 may mean some more IRQs occured yet
$ ./hstoq_icc aabbccddeeff0011
input: aabbccddeeff0011 passes: 1000000
rdtsc:
start = 40956923434320
end = 40957101562616
difference = 178128296
Ok, ~178 is almost close to the first
I don't how many seconds those measurements come up to. Here is the same
command timed by the UNIX 'time' utility.
$ time ./hstoq_icc aabbccddeeff0011
input: aabbccddeeff0011 passes: 1000000
rdtsc:
start = 41221246973984
end = 41221414368856
difference = 167394872
real 0m0.107s
user 0m0.108s
sys 0m0.000s
I have shown only one run to preserve space.
As the output shows, the code takes about one-tenth of a second for a
million passes over the test string. Obviously, larger test strings
(and greater loops) take more time than smaller ones.
Real time is hard to compare on different machines/OS ...
[the code...]
It would be interesting to see it disassembled.
This FOR/WHILE/IF may end up with many branch instructions ?
If I compare only the source size yet,
C-coders seem to need much more keystrokes than ASM'ers ;)
Ok, I'll check my code speed after the addons again.
Done yet.
__
wolfgang
___________
The complete code follows. It compiles with the GNU C compiler or the
Intel C compiler. The compilation command is:
For gcc:
gcc -Wall -W -std=gnu99 hstoq.c
For Intel C:
icc -Wall -W -std=c99 hstoq.c
Usage is: PROGRAM HEXSTRING [ITERATIONS]
PROGRAM is executable file name.
HEXSTRING is the string to be converted.
ITERATIONS is the number of times to loop. This is optional. If left out
the default is 1,000,000 loops.
Code:
=====
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
/* #define DBG 1 */ /* uncomment for debug output */
#define DEFAULT_LOOPS 1000000L
static inline uint_fast64_t hstoq(const char *restrict, int);
static inline uint64_t fx_rdtsc(void);
int main(int argc, char **argv) {
char *hexstr;
int nch;
uint_fast64_t compare, val;
uint64_t start, end;
register long loops = DEFAULT_LOOPS;
const char const *usage = "Usage: a.out STR [n]\nSTR - hex string;"
"must _not_ have prefix or suffix\nn - no. of times to convert."
" default is %ld\n\n";
if (argc < 2) goto err_exit;
else {
hexstr = argv[1];
nch = strlen(argv[1])-1;
#ifdef DBG
printf("argv[1] = %p\thexstr = %p\tnch = %d\nstrlen(argv[1])
= %d\n",
(void*)argv[1], (void*)hexstr, nch, strlen(argv[1]));
#endif
for (int ctr = nch; ctr >= 0; ctr--)
if (!isxdigit((int)argv[1][ctr])) goto err_exit;
if (argc > 2) {
errno = 0;
loops = strtol(argv[2], NULL, 0);
if (errno == ERANGE || loops <= 0) goto err_exit;
}
}
errno = 0;
compare = (uint_fast64_t)strtoull(argv[1], NULL, 16);
if (errno == ERANGE) goto err_exit;
printf("input: %s\tpasses: %ld\n", argv[1], loops);
start = fx_rdtsc();
while (loops >= 0) { val = hstoq(hexstr, nch); loops--; }
end = fx_rdtsc();
if (val != compare) {
printf("MISMATCH!\n\tstrtoull = %" PRIxFAST64
"\n\thstoq = %" PRIxFAST64 "\n", compare, val);
}
printf("rdtsc:\n\tstart = %" PRIu64 "\n\tend = %" PRIu64
"\ndifference = %" PRIu64 "\n", start, end, end - start);
return 0;
err_exit:
perror(NULL);
printf(usage, DEFAULT_LOOPS);
return EXIT_FAILURE;
}
static inline uint64_t fx_rdtsc(void) {
asm("rdtsc");
return;
}
static inline uint_fast64_t hstoq(const char *restrict hs, int nch) {
uint_fast64_t total = 0, currch;
uint_fast8_t hexch;
int shift = -4;
#ifdef DBG
printf("string = %s\nfrom hs = %s\n", hs-nch, hs);
#endif
while (nch >= 0) {
if (*(hs+nch) >= 0x61) hexch = *(hs+nch) - 0x57;
else if(*(hs+nch) >= 0x41) hexch = *(hs+nch) - 0x37;
else hexch = *(hs+nch) - 0x30;
currch = hexch;
currch <<= (shift += 4);
total += currch;
#ifdef DBG
printf("nch = %d\t*hs = %c\thexch = %c\ncurrch = %"
PRIxFAST64 "\ttotal = %" PRIxFAST64 "\nshift = %d\n\n",
nch, *hs, hexch, currch, total, shift);
#endif
nch--;
}
return total;
}
.
- References:
- Re: Is PSHUFW instruction MMX or SSE or SSE2? Is NASM manual correct?
- From: //\\\\o//\\\\annabee
- Re: Is PSHUFW instruction MMX or SSE or SSE2? Is NASM manual correct?
- From: santosh
- COMPARE HLL/ASM
- From: Wolfgang Kern
- Re: COMPARE HLL/ASM
- From: santosh
- Re: Is PSHUFW instruction MMX or SSE or SSE2? Is NASM manual correct?
- Prev by Date: Re: COMPARE HLL/ASM
- Next by Date: Re: COMPARE HLL/ASM
- Previous by thread: Re: COMPARE HLL/ASM
- Next by thread: Re: COMPARE HLL/ASM
- Index(es):
Relevant Pages
|