Re: Optimization help please
- From: "David J. Craig" <spamtrap@xxxxxxxxxx>
- Date: Fri, 16 Sep 2005 03:08:56 +0000 (UTC)
I suspect that if you read the file using multiple buffers and threads,
you can get much better performance by using overlapped IO. That permits
one buffer to be CRC'd while the next buffer is being filled. Multiple
threads would let those with the new dual-core or SMP systems more fully
utilize the available resources.
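
As a rough sketch of the two-buffer idea: the code below uses the standard
CRC-32 (polynomial 0xEDB88320) purely as a stand-in, since the routine in
question is not standard CRC32 and was not posted. The names crc32_update,
crc32_stream and CHUNK_SIZE are my own, and the reads are ordinary blocking
fread() calls. The comments mark where an overlapped read would be started
and waited on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK_SIZE 65536u  /* hypothetical buffer size; tune by measurement */

/* Incremental CRC-32: pass 0 for the first chunk, then the previous result. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    crc = ~crc;                               /* recover the raw register */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Checksums a stream through two alternating buffers (not reentrant,
   because the buffers are static). */
static uint32_t crc32_stream(FILE *fp)
{
    static unsigned char buf[2][CHUNK_SIZE];
    uint32_t crc = 0;
    int cur = 0;
    size_t have = fread(buf[cur], 1, CHUNK_SIZE, fp);

    while (have > 0) {
        int next = 1 - cur;
        /* Overlapped version: start the asynchronous read into buf[next]
           here. This blocking fread() is only a stand-in for it. */
        size_t got = fread(buf[next], 1, CHUNK_SIZE, fp);

        /* CRC the buffer that is already full. */
        crc = crc32_update(crc, buf[cur], have);

        /* Overlapped version: wait here for the read into buf[next]. */
        cur = next;
        have = got;
    }
    return crc;
}
```

Because crc32_update carries the running value from chunk to chunk, the
result does not depend on where the buffer boundaries fall, which is what
makes the buffer swap safe.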
The VC 6 compiler is not as efficient as the current version. Also, Intel's
compiler is the best I have heard about. I am saying that there are very
few people who can improve such a routine using assembly. I am also saying
that, with the variations in current processors, any assembler code will
have to contain compromises that will decrease speed when one of the other
processors is encountered. It would be possible to write a special routine
for each processor and use a function pointer to call the appropriate one
based upon the CPU.
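
The function pointer approach might look something like the sketch below.
Everything named here is an assumption for illustration: the two routines
are a bit-at-a-time and a table-driven standard CRC-32 standing in for
processor-specific versions, and cpu_prefers_table_lookup() is a stub where
a real CPUID check would go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc_fn)(const unsigned char *buf, size_t len);

/* Portable bit-at-a-time version: the fallback for any CPU. */
static uint32_t crc32_bitwise(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

static uint32_t crc_table[256];
static int crc_table_ready = 0;

static void crc32_init_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int bit = 0; bit < 8; bit++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

/* Table-driven version, standing in for a routine tuned to one CPU family. */
static uint32_t crc32_table_driven(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (!crc_table_ready)
        crc32_init_table();
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = crc_table[(crc ^ buf[i]) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical CPU probe: a real one would query CPUID. This stub only
   marks where that decision is made. */
static int cpu_prefers_table_lookup(void)
{
    return 1;
}

/* Pick the routine once at startup; callers go through the pointer. */
static crc_fn select_crc_routine(void)
{
    return cpu_prefers_table_lookup() ? crc32_table_driven : crc32_bitwise;
}
```

The selection is made once, so the per-call cost is a single indirect call,
which is negligible next to checksumming a large buffer.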
Have the compiler output assembler code for that routine and see if any
inefficiencies can be found. You may find multiple tests that can be done
once. You may find stores of the data to memory with each loop cycle to be
unnecessary and inefficient. You might declare the CRC variable with the
register keyword. Very short routines have very little opportunity for
speed increases. Using some of the new instructions, SSE, MMX, etc. may
permit more bytes to be processed in each loop cycle.
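
To illustrate the point about stores inside the loop, here is a sketch
using the standard CRC-32 polynomial as a stand-in for the actual routine;
the function names are mine. In the first version the running value lives
behind a pointer, and because an unsigned char pointer may alias it, the
compiler generally has to write it back to memory before reading the next
byte. The second version keeps it in a local and stores once at the end.
Modern compilers tend to treat the register keyword as a hint at most, so
the local copy usually matters more than the keyword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Updates the raw CRC register through the pointer on every step. */
static void crc32_through_pointer(uint32_t *crc, const unsigned char *buf,
                                  size_t len)
{
    assert(crc != NULL && (buf != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        *crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            *crc = (*crc >> 1) ^ (0xEDB88320u & (0u - (*crc & 1u)));
    }
}

/* Same arithmetic, but the register is copied to a local for the loop. */
static void crc32_through_local(uint32_t *crc, const unsigned char *buf,
                                size_t len)
{
    assert(crc != NULL && (buf != NULL || len == 0));
    uint32_t c = *crc;                 /* one load before the loop */
    for (size_t i = 0; i < len; i++) {
        c ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
    }
    *crc = c;                          /* one store after the loop */
}
```

Both functions work on the raw register, so the caller starts it at
0xFFFFFFFF and inverts it at the end. Comparing the assembler listings of
the two is a quick way to see whether the compiler already hoists the store.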
Some more information about how many of these ideas are already implemented
would be helpful. Just saying you need a rather simple routine optimized
would cause most of us to ask a lot of questions. You said you needed 'such
an algorithm', which suggests that what you really need is just a way to
determine if a block of data or a file has changed.
Just to make sure you have already eliminated the obvious: the IO will cost
far more cycles than the CRC routine. Depending upon the storage media, it
may be necessary to use several threads, all of them requesting data to be
read. The CRC can then be computed on each 'block', and the algorithm could
combine the results to allow the system to run as fast as it possibly can.
You did say that the IO has been optimized, so some of the above may already
be implemented.
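
On combining per-block results: for the standard CRC-32 this can be done
without rereading any data, as in the sketch below. Whether the same trick
applies to the non-standard checksum in question is an assumption I cannot
check without seeing it. The names crc32_block and crc32_join are mine, and
the threading itself is left out. The join advances the first CRC over as
many zero bytes as the second block is long, then XORs in the second CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of the raw CRC-32 register for a single input byte. */
static uint32_t crc32_step(uint32_t reg, unsigned char byte)
{
    reg ^= byte;
    for (int bit = 0; bit < 8; bit++)
        reg = (reg >> 1) ^ (0xEDB88320u & (0u - (reg & 1u)));
    return reg;
}

/* Standard CRC-32 of one independent block. */
static uint32_t crc32_block(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    uint32_t reg = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        reg = crc32_step(reg, buf[i]);
    return reg ^ 0xFFFFFFFFu;
}

/* CRC of block A followed by block B, given only crc(A), crc(B) and the
   length of B. Runs in time proportional to len_b. */
static uint32_t crc32_join(uint32_t crc_a, uint32_t crc_b, size_t len_b)
{
    for (size_t i = 0; i < len_b; i++)
        crc_a = crc32_step(crc_a, 0);  /* shift crc(A) past B's length */
    return crc_a ^ crc_b;
}
```

For example, joining the CRCs of "12345" and "6789" gives the same value as
checksumming "123456789" in one pass. This loop is linear in the block
length; zlib's crc32_combine does the equivalent job in logarithmic time
and would be the better choice for large blocks.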
"Floptimize" <spamtrap@xxxxxxxxxx> wrote in message
news:1126800211.784160.313790@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> Thanks for taking time to reply to my post, however I find it
> rhetorical and presumptuous. Please don't take offence to my reply.
>
> Of course I understand the algorithm, and of course I know it's not C++,
> and of course we have a need for such an algorithm in our design. The
> need to translate it into assembly is NOT just for fun.
>
> If you are disappointed in my post because I am asking for assistance,
> say it directly.
>
> I respect your valid opinion about the efficiency of compilers, and
> target processors. Our application is not designed for specific
> processors, just Windows 2000 and up, on x86 architecture.
>
> Our applications transmit video and audio files over a 100mb network.
> The files can be very large, and we do not have absolute control of the
> files at either end. We use CRC32 checksums to make logical decisions
> about whether or not to send the files over the network. We also use
> other indicators such as timestamps and file size. Our applications
> cache the checksums for the lifetime of the process, so when
> referencing the file for the first time we must re-compute the checksum
> of the file. Reading the file in its entirety from disk and computing the
> checksum is cpu and disk intensive, as you well know. We have done as
> well as we can to optimize the disk read performance, which is the
> biggest bottleneck, and now we want to improve the performance of the
> checksum algorithm. Maybe there is another way to achieve our goals
> that we are unaware of....
>
> My post is prompted by an article I read about benchmarks of checksum
> algorithms written in assembly versus C. The algorithm I read about is
> not standard CRC32, and I am not experienced enough in Intel x86
> assembly language to convert my routine from C.
>
> Another reason I asked for help is that, as I am a passionate C++
> developer, there are passionate assembly developers who enjoy helping
> others. I was hoping for help from someone like myself.
>
> There you have it.
>