Re: MASM Expert needed immediately



Betov wrote:
f0dder wrote:

Then, write an Assembler that beats RosAsm speed with SSE.

Umm, I must admit I fail to see how SSE would speed up an assembler... the
use of floating-point calculations there is pretty limited.

And, for your information, 100% of what I write works on the
older "Windows able PCs".

You don't deal with audio or graphics programming, do you?

Instead of saying stupidities, ask yourself the reasons why,
after so many years of development, Intel preferred to implement
the SSE joke, rather than making the String Instructions faster
(and why AMD did it).

I wouldn't mind seeing decently implemented string instructions, LOOP, et
cetera. And I do think it's silly that those instructions perform so abysmally.
But even so, getting a thing like LOOP down to the same performance as
dec+jnz gives NOWHERE near the speed boost that SSE (or MMX, for
that matter) can give you.
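
To make the comparison concrete, here is a minimal sketch (not code from this thread) of the kind of audio-style loop where SSE pays off: mixing one buffer into another with a gain. The function names scale_add_scalar and scale_add_sse are made up for illustration, the SSE version assumes the length is a multiple of 4, and it only builds on x86/x86-64 compilers that ship <xmmintrin.h>. No timings are claimed; the point is only that the SSE loop handles four floats per iteration, which is a different kind of gain than shaving cycles off the loop counter.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics; x86/x86-64 only */

/* Scalar version: one multiply-add per loop iteration. */
void scale_add_scalar(float *dst, const float *src, float gain, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += src[i] * gain;
}

/* SSE version: four floats per loop iteration.
   Assumption for this sketch: n is a multiple of 4. */
void scale_add_sse(float *dst, const float *src, float gain, size_t n)
{
    assert(n % 4 == 0);
    __m128 g = _mm_set1_ps(gain);             /* gain copied into all 4 lanes */
    for (size_t i = 0; i < n; i += 4) {
        __m128 s = _mm_loadu_ps(src + i);     /* unaligned load of 4 floats */
        __m128 d = _mm_loadu_ps(dst + i);
        d = _mm_add_ps(d, _mm_mul_ps(s, g));  /* d += s * gain, 4 lanes at once */
        _mm_storeu_ps(dst + i, d);
    }
}
```

Both functions compute the same result; the SSE one simply runs a quarter as many iterations. Unaligned loads are used so the sketch works on any buffer; aligned loads would be the usual next step if the data is 16-byte aligned.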

Of course, MMX and SSE aren't "general-purpose" like the ALU instructions,
but you could argue that x87 FPU should be dropped as well, since those
instructions can be performed with the ALU as well...

