Re: Question for Herbert
- From: "wolfgang kern" <nowhere@xxxxxxxxxxx>
- Date: Fri, 25 May 2007 04:37:03 +0200
Wannabee wrote:
[about the prefetch...]
The prefetch itself takes its time ...
Only if it is needed.[...]
Yes, sure.
The worst cases are RD-modify-WR instructions like XOR
(for semi-transparent mouse cursors and my old clock arrows).
Regardless how I configured caching or tried to prefetch it,
V-RAM reads are always much slower than writes.
So I now save the background and redraw it (within the V-RAM)
instead of XORing the image twice.
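
To illustrate the point about reads, here is a toy access-count model in Python. It is only a sketch: the `CountingVRAM` class and the helper names are hypothetical, a `bytearray` stands in for V-RAM, and the saved background is held in a plain Python list, since the post does not say how the saved copy is stored or moved. It counts accesses and measures no real timing.

```python
class CountingVRAM:
    """Hypothetical stand-in for V-RAM that counts byte reads and writes."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.reads = 0
        self.writes = 0

    def read(self, i):
        self.reads += 1
        return self.mem[i]

    def write(self, i, value):
        self.writes += 1
        self.mem[i] = value & 0xFF


def xor_sprite(vram, pos, sprite):
    # Read-modify-write: every pixel costs one read and one write.
    # Calling this twice with the same sprite restores the background.
    for k, s in enumerate(sprite):
        vram.write(pos + k, vram.read(pos + k) ^ s)


def draw_with_save(vram, pos, sprite):
    # Read the background once, then overwrite it with the sprite.
    saved = [vram.read(pos + k) for k in range(len(sprite))]
    for k, s in enumerate(sprite):
        vram.write(pos + k, s)
    return saved


def restore(vram, pos, saved):
    # Write-only: put the saved background back.
    for k, v in enumerate(saved):
        vram.write(pos + k, v)
```

In this model, drawing and erasing an N-pixel image by XORing twice costs 2N reads and 2N writes, while save-and-restore costs N reads and 2N writes. If reads are the slow part, halving them is where the gain would come from.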
...cycles were measured on single 512 px lines [500:66/CPU:AGP].
( Why did you not say "cs", "cy" for cycles :) )
It would interfere with 'code-seg' and 'carry' ? :)
but 'px' is a more common abbreviation, like 'mm'.
I guess 500:66 means 500 MHz, 66 MHz chipset, AGP.
The AMD here runs at 1100 MHz, 266 MHz FSB, and has a 200 MHz memory clock.
The PIII is 500 MHz, 66, AGP, just like above. So how can it appear faster?
It _isn't_ (confirmed by horizontal line drawing and bitblt, and just about
every program I run on it), but it has some advantage for this kind of
code. I am guessing, but the code tests seem to support my claim.
It may also depend on the access type; my measured count is for
direct V-RAM writes (no caching nor paging, and all in ring 0).
You once got an old KESYS demo pack and IIRC it should include
'kespeed.exe', which shows the clock cycles per dot for various
drawn components (lines, circles, cakelines and text).
It needs at least a VBE2.0 card with flat 1024x768,8 or 1152x864,8.
Also the 'circles.exe' reports line and circle speed info.
As for Herbert's code this is ?????????????
Depends on the machine ?
[]
Yes, and it's hard to compare different machines by cyclecount
without having the BUS-speed ratios in the calculation.
Shouldn't the AMD be faster?
Assuming the timings are correctly performed, what would you say?
V-RAM access speed is not limited by the CPU but by the memory
control chips and the BUS speed.
A CPU is usually 5 to 20 times faster than the busses.
I know one board with a 2.7 GHz Celeron and a 66 MHz BUS,
guess how this performs ....
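
As a rough sketch of why raw cycle counts mislead across machines, the Python below computes the CPU-to-bus clock ratio and converts a cycle count into wall-clock time. The function names are hypothetical, the clock figures are the ones quoted in this thread, and the cycle count in the usage note is made up for illustration, not a measurement.

```python
def cpu_to_bus_ratio(cpu_mhz, bus_mhz):
    """How many CPU clocks elapse per bus clock."""
    return cpu_mhz / bus_mhz


def cycles_to_ns(cycles, cpu_mhz):
    """Convert a CPU cycle count to nanoseconds.

    One cycle at f MHz lasts 1000 / f ns.
    """
    return cycles * 1000.0 / cpu_mhz


if __name__ == "__main__":
    # Clock pairs mentioned in the thread: (CPU MHz, bus MHz).
    for cpu, bus in [(500, 66), (1100, 266), (2700, 66)]:
        print(cpu, bus, round(cpu_to_bus_ratio(cpu, bus), 1))
```

The 500:66 machine has a ratio of about 7.6, while the 2.7 GHz Celeron on a 66 MHz bus is near 41, so a bus-bound write burns far more CPU cycles on the Celeron without being any faster in real time. Likewise, a hypothetical 100-cycle operation takes 200 ns at 500 MHz but about 91 ns at 1100 MHz, so comparing by time rather than by cycle count is the safer basis.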
Yes, I use the API bitblt.
Looks like windoze uses these hidden functions to become faster.
OK. Can you explain how to use this "hidden" function?
I have some info for old Cirrus and S3 cards; about my new
ATI cards I know only, by coincidence, that the screen refresh rate
can be altered with a single byte at BAR0+0C (ie: 04D=60 Hz, 080=100 Hz).
The whole BAR0 image is 16 KB large and somewhere in there must
be the command register and the parameters ...
I haven't analysed its windoze drivers yet,
because this is a very boring task.
__
wolfgang