SSE help please....
From: Asfand Yar Qazi (im_not_giving_it_here_at_i_hate_spam.com)
Date: 04/13/04
Date: Tue, 13 Apr 2004 03:27:10 +0100
Hi,
I'm messing around with the built-in vector operations in GCC 3.3.3
(__builtin_ia32_addps, etc.) that generate SSE instructions. I'm not
experienced in these matters, so please forgive me if I say something silly.
I use the term 'slot' as follows: a 128-bit SSE (XMM) register is made of
4 adjacent 32-bit _slots_.
I was wondering if there is a way to do the following with SSE (SSE1 on a
Pentium 3, if anyone's wondering :)
Add all 4 32-bit floats in an SSE register and store the sum in a 32-bit
slot of some SSE register. (I'd like to implement matrix multiplication
like this.)
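For reference, here is a minimal sketch of one way such a horizontal add could be done with SSE1 only. It assumes the <xmmintrin.h> intrinsics rather than the raw __builtin_ia32_* functions mentioned above, and the names hsum_slot0 and hsum_ps are made up for illustration. SSE1 has no single "add across" instruction, so the sketch folds the register onto itself twice.

```c
#include <xmmintrin.h>

/* Hypothetical helper: returns a register whose slot 0 holds
 * v[0] + v[1] + v[2] + v[3]. The other slots hold partial sums
 * and should be treated as garbage. Uses SSE1 instructions only. */
static __m128 hsum_slot0(__m128 v)
{
    /* Copy the high pair over the low pair: [v2, v3, v2, v3] (movhlps). */
    __m128 hi = _mm_movehl_ps(v, v);
    /* Pairwise add: slot 0 = v0+v2, slot 1 = v1+v3 (addps). */
    __m128 pairs = _mm_add_ps(v, hi);
    /* Broadcast slot 1 of the pair sums to every slot (shufps). */
    __m128 odd = _mm_shuffle_ps(pairs, pairs, _MM_SHUFFLE(1, 1, 1, 1));
    /* Scalar add touches slot 0 only: (v0+v2) + (v1+v3) (addss). */
    return _mm_add_ss(pairs, odd);
}

/* Hypothetical convenience wrapper: extract slot 0 as a plain float. */
static float hsum_ps(__m128 v)
{
    return _mm_cvtss_f32(hsum_slot0(v));
}
```

With GCC this would need -msse on a 32-bit target. Calling hsum_ps(_mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f)) gives 10.0f. For matrix multiplication it may be cheaper to arrange the data so that four dot products accumulate in the four slots at once, which avoids the shuffles, but that is a design choice outside this sketch.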
What is the difference between the movups and movaps instructions?
As far as I can tell, both move packed data, but movaps expects aligned
data while movups makes no assumption about alignment (apparently). What
does that mean in practice? GCC's builtin functions for both
instructions gave the same result.
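To make the alignment question concrete, here is a small sketch, again assuming the <xmmintrin.h> intrinsics instead of the raw builtins. The names demo_buf, is_aligned16, load_any and load_aligned16 are hypothetical. Both loads return the same values whenever the address is a multiple of 16 bytes, which would explain seeing identical results. The difference only shows when it is not: movaps with a misaligned memory operand raises a general-protection fault, while movups accepts any address.

```c
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical test buffer, forced onto a 16-byte boundary (GCC syntax). */
static float demo_buf[8] __attribute__((aligned(16))) =
    { 0.0f, 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f };

/* True if p is a multiple of 16 bytes, i.e. safe for movaps. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Unaligned load (movups): valid for any float address. */
static __m128 load_any(const float *p)
{
    return _mm_loadu_ps(p);
}

/* Aligned load (movaps): p MUST be 16-byte aligned, otherwise
 * the instruction faults at run time. */
static __m128 load_aligned16(const float *p)
{
    return _mm_load_ps(p);
}
```

Here load_aligned16(demo_buf) and load_aligned16(demo_buf + 4) are fine, and load_any(demo_buf + 1) is fine, but load_aligned16(demo_buf + 1) would be expected to crash because that address is only 4-byte aligned.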
Thanks for your patience,
Asfand Yar
-- http://www.it-is-truth.org/