Re: x86 instructions guide
- From: aku ankka <jukka@xxxxxxxxxxxx>
- Date: Mon, 17 Sep 2007 20:02:21 -0000
On Sep 17, 4:00 pm, Robert Redelmeier <red...@xxxxxxxxxxxxxxx> wrote:
>> Fetching of vertex attributes is one place where the driver
>> can pack data more tightly, e.g. create a more optimal layout
>> so the .w slot of a vec3 can be used to hold a scalar value,
>> for example. This kind of pre-shuffling is not very cool
>> when the data comes from client-side arrays; when it is
>> in a VBO the driver can do the optimization once and use the
>> data many times, which is good.
> It sounds like some specialized GPU caches and prefetch
> algorithms might be useful.
There are already caches at many levels of the hierarchy. There is also
a vertex cache; if you draw indexed primitives, recently used entries
can be looked up by index, i.e. it's a simple index-to-cache-entry
mapping, cheap and effective.
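
To make the index-to-cache-entry idea concrete, here is a rough
sketch. The FIFO replacement policy, the cache size of 16, and the
function name are my assumptions for illustration only; real hardware
differs between vendors and generations.

```python
from collections import deque


def count_vertex_fetches(indices, cache_size=16):
    """Count cache misses (actual vertex fetches) for an index stream.

    Toy model: a FIFO cache keyed by vertex index. Both the policy and
    the default size are illustrative assumptions.
    """
    cache = deque(maxlen=cache_size)  # oldest index drops out when full
    misses = 0
    for idx in indices:
        if idx not in cache:
            misses += 1        # miss: the vertex must be fetched
            cache.append(idx)  # remember it for later reuse
    return misses


# Two triangles sharing an edge: 6 indices, but only 4 unique vertices.
quad = [0, 1, 2, 2, 1, 3]
print(count_vertex_fetches(quad), "fetches for", len(quad), "indices")
```

With indexed drawing the shared vertices are looked up rather than
fetched again, so the miss count tracks the unique vertices as long as
the reuse happens within the cache window.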
The thing is, when you do get a cache miss, what matters is where
the data resides and how it is organized.
If it is a vertex array, what matters is how the client application
has stored the data: is it "packed"? That means the arrays whose
fetches can be combined have the same stride, and the data elements
lie within some specific memory range that is efficient to fetch.
Lots of assumptions: the stride should preferably be a power of two,
and so on.
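
A minimal sketch of such a packed layout, assuming a vec3 position
plus one scalar per vertex (the array names and the "weight" attribute
are hypothetical). The scalar goes into the .w slot, so each vertex is
four float32 values with a 16-byte stride:

```python
import struct

# Hypothetical client data held in two separate arrays.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # vec3
weights = [0.25, 0.5, 1.0]                                       # scalar

VERTEX_FORMAT = "<4f"                   # x, y, z, w as little-endian float32
STRIDE = struct.calcsize(VERTEX_FORMAT)  # 16 bytes, a power of two


def pack_interleaved(positions, weights):
    """Interleave vec3 + scalar into one buffer, scalar in the .w slot."""
    out = bytearray()
    for (x, y, z), w in zip(positions, weights):
        out += struct.pack(VERTEX_FORMAT, x, y, z, w)
    return bytes(out)


def fetch_vertex(buf, i):
    """Read vertex i with a single fixed-stride fetch."""
    return struct.unpack_from(VERTEX_FORMAT, buf, i * STRIDE)


packed = pack_interleaved(positions, weights)
print(STRIDE, len(packed), fetch_vertex(packed, 1))
```

One fetch per vertex then returns both attributes, instead of two
fetches from two arrays with different strides.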
This can be overcome with the recommended way to use the API: store
the data in either a VBO or a DL. In fact, it is advantageous for the
driver to make a copy of the data even if it resides in client-side
memory as an array; the reason is childishly obvious:
--> The driver cannot return from the draw primitive command until it
has read all the data from the client-side array, because the client
could otherwise modify the data while the graphics processor is still
waiting to read it; not good. There are two obvious solutions:
A) don't return from the function until the data has been read
B) copy the data immediately to a staging area, from which it can be
transferred to the graphics processor's local memory at an appropriate
time
(A) is bad: it stalls the client application. Approach (B) is
generally preferred, and if that is done, the data layout can be
optimized while copying, no big deal. But there is still a copy
taking place AND a transfer over the bus to the graphics
processor's local memory so that the data is accessible. Not good.
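
Approach (B) can be modelled roughly as below. This is a toy, not how
any real driver is written; the class and method names are made up for
illustration. The point is only that the snapshot taken at draw time
is what reaches the graphics processor, whatever the client does to
its array afterwards.

```python
class StagingDriver:
    """Toy model of approach (B): snapshot client data at draw time."""

    def __init__(self):
        self.pending = []     # staged copies waiting for transfer
        self.gpu_memory = []  # stand-in for the GPU's local memory

    def draw_arrays(self, client_array):
        # Copy immediately so the call can return without stalling
        # the client; bytes() takes an independent snapshot.
        self.pending.append(bytes(client_array))

    def flush(self):
        # Later, at a time the driver chooses, move the staged copies
        # over the bus (modelled here as a plain list append).
        self.gpu_memory.extend(self.pending)
        self.pending.clear()


driver = StagingDriver()
client = bytearray(b"\x01\x02\x03\x04")
driver.draw_arrays(client)
client[0] = 0xFF  # client reuses its buffer as soon as the call returns
driver.flush()
print(driver.gpu_memory)
```

Note that the copy and the bus transfer still happen on every draw
call, which is exactly the cost described above.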
To overcome this bandwidth hogging, it is preferred that the
application store static data in advance so that the driver can
manage where the data resides at any given time. The preferred
mechanism is the VBO. A DL is also all right, and can even be slightly
more efficient, but it is being phased out; OpenGL ES, for example,
has abandoned DLs completely in favour of VBOs.
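
A back-of-the-envelope model of the bandwidth difference, under the
simplifying assumption that a buffer object stays resident in GPU
memory once uploaded (a real driver may move it around). The class,
the method names, and the mesh size are hypothetical:

```python
class BusCounter:
    """Counts bytes crossing the bus in a toy driver model."""

    def __init__(self):
        self.bytes_transferred = 0
        self.buffers = {}  # name -> data resident on the GPU

    def draw_client_array(self, data):
        # Client-side array: the data is re-sent on every draw call.
        self.bytes_transferred += len(data)

    def create_buffer(self, name, data):
        # VBO-style upload: pay for the transfer once.
        self.buffers[name] = bytes(data)
        self.bytes_transferred += len(data)

    def draw_buffer(self, name):
        # Data is already resident; no bus traffic in this model.
        if name not in self.buffers:
            raise KeyError(name)


mesh = bytes(1024)  # 1 KiB of static vertex data
frames = 100

client_side = BusCounter()
for _ in range(frames):
    client_side.draw_client_array(mesh)

buffered = BusCounter()
buffered.create_buffer("mesh", mesh)
for _ in range(frames):
    buffered.draw_buffer("mesh")

print(client_side.bytes_transferred, buffered.bytes_transferred)
```

For static data the client-array cost grows with the number of draw
calls, while the buffer-object cost is paid once.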
OpenGL 3.0 goes further and generalizes memory object creation and
binding under a single unified interface. Currently OGL has two or
three different mechanisms for different types of memory objects, and
that model is a pain to support in a driver. The new approach is
attractive: it is more efficient, clean and forward-looking. The
specification should be out for the general population in Q4/07 if I
remember correctly. :)
In short: fancy prefetch algorithms won't do squat if the system is
engineered wrong. The best thing to do is to let the driver manage the
memory so that it can optimize the layout and location of the data for
efficient fetching. The companies in Khronos working on the new
specification have finally decided to ditch the dead weight of 1980s
legacy paradigms. The OpenGL state machine, ironically, was engineered
so that actual OpenGL API calls mapped nearly directly onto that era's
Silicon Graphics workstation OpenGL-on-chip solutions.
It was, and still is, a decent design: it's still in use and a widely
accepted standard. But the industry is moving on. There is no SGI as
we used to know it, and 3DLabs, who did a major engineering effort for
OpenGL 2.0, are also kind of yesterday's news. But OpenGL is still
around, and that's a pretty cool achievement. :)