Re: Any libraries for vector mask and vector population count?
- From: glen herrmannsfeldt <gah@xxxxxxxxxxxxxxxx>
- Date: Mon, 07 Apr 2008 01:28:56 -0800
Dan wrote:

> Yes, raw speed is needed. That's why we went from Fortran to assembler
> in the first place. Since the code was written, the amount of data has
> increased by three, perhaps four, orders of magnitude.
>
> I just looked over the standard's definition of integer, and it
> implies all integers are signed. Is that true? If there are unsigned
> integers, then there should be no problem.

They are signed, but you can use them with the bit manipulation
intrinsics anyway. Funny things might happen on ones' complement
machines, but those should be rare.

I offer two population count methods in Fortran that should be
fairly fast. The first clears the lowest set bit in a loop; the
second adds adjacent bit fields in parallel. The second seems to
generate lots of moves to and from memory on g95 without
optimization, but keeps intermediates in registers with -O2.

With g95 on IA32 at -O2, the second method takes 29 instructions:
straight-line code with no branches or intermediate stores. For a
vector population count it might be possible to optimize it a little
more by overlapping more of the additions. That might help on a
machine with more registers than IA32 to hold intermediate values.

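      ! Check the two population counts against each other.
      do i=1,100000000
         ! Method 1: clear the lowest set bit until none remain.
         j=i
         k=0
         do while(j.ne.0)
            j=iand(j,j-1)
            k=k+1
         end do
         ! Method 2: add adjacent bit fields, doubling the width.
         l1=iand(i,z'55555555')+ishft(iand(i,z'aaaaaaaa'),-1)
         l2=iand(l1,z'33333333')+ishft(iand(l1,z'cccccccc'),-2)
         l4=iand(l2,z'0f0f0f0f')+ishft(iand(l2,z'f0f0f0f0'),-4)
         ! Same masks as the step above; each byte is already at
         ! most 8, so this step leaves the value unchanged.
         l8=iand(l4,z'0f0f0f0f')+ishft(iand(l4,z'f0f0f0f0'),-4)
         l16=iand(l8,z'00ff00ff')+ishft(iand(l8,z'ff00ff00'),-8)
         l32=iand(l16,z'0000ffff')+ishft(iand(l16,z'ffff0000'),-16)
         m=l32
         ! Report any value where the two methods disagree.
         if(k.ne.m) write(*,*) i,k,m
      enddo
      end
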
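For the vector case, one possible sketch (not part of the program above) wraps the parallel-sum method in an elemental function, so it can be applied to a whole array at once. The names popcount_sketch, popcnt32 and i4 are made up for illustration, and a 32-bit two's complement integer kind is assumed. It differs from the program above in two ways: it shifts before masking, so every mask constant has its top bit clear, and it counts the sign bit separately, so no intermediate sum can exceed the largest positive integer.

```fortran
module popcount_sketch
  implicit none
  ! Assumption: selected_int_kind(9) gives a 32-bit integer kind.
  integer, parameter :: i4 = selected_int_kind(9)
  ! Masks with the top bit clear, so each fits in a signed integer.
  integer(i4), parameter :: m1  = int(z'55555555', i4)
  integer(i4), parameter :: m2  = int(z'33333333', i4)
  integer(i4), parameter :: m4  = int(z'0F0F0F0F', i4)
  integer(i4), parameter :: m8  = int(z'00FF00FF', i4)
  integer(i4), parameter :: m16 = int(z'0000FFFF', i4)
contains
  ! Hypothetical helper: number of set bits in one 32-bit integer.
  elemental function popcnt32(x) result(n)
    integer(i4), intent(in) :: x
    integer(i4) :: n, y
    ! Clear the sign bit first; it is added back at the end.
    y = ibclr(x, 31)
    ! Add adjacent fields of width 1, 2, 4, 8 and 16 bits.
    y = iand(y, m1)  + iand(ishft(y, -1),  m1)
    y = iand(y, m2)  + iand(ishft(y, -2),  m2)
    y = iand(y, m4)  + iand(ishft(y, -4),  m4)
    y = iand(y, m8)  + iand(ishft(y, -8),  m8)
    y = iand(y, m16) + iand(ishft(y, -16), m16)
    n = y
    if (btest(x, 31)) n = n + 1
  end function popcnt32
end module popcount_sketch

program demo
  use popcount_sketch
  implicit none
  integer(i4) :: v(4)
  v = (/ 0_i4, 1_i4, 255_i4, -1_i4 /)
  write(*,*) popcnt32(v)        ! elementwise counts: 0, 1, 8, 32
  write(*,*) sum(popcnt32(v))   ! total set bits in the vector: 41
end program demo
```

Whether the compiler overlaps the additions across array elements is up to its optimizer. With a Fortran 2008 compiler, the intrinsic popcnt returns the same count and may map to a hardware instruction.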
-- glen