Since Felix implemented a different heuristic for texture allocation, I
decided to do some measurements on the r100 of how fast AGP texturing is.
Test conditions are as follows:
Celeron Tualatin 1000@1330 on BX-133 (note that this has consequences for
AGP speed: AGP 1x actually has the transfer rate of "AGP 1.33x", and AGP 2x
that of "AGP 2.66x"), 1.06GB/s main memory bandwidth. Graphics card is a Radeon
7200 SDR (@160MHz, memory bandwidth is a paltry 2.56GB/s) 32MB.
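For reference, the "AGP 1.33x" and "AGP 2.66x" figures follow from the overclocked bus. Here is a rough sketch of the arithmetic, assuming the BX board derives the AGP clock from the 133 MHz FSB with its 2/3 divider, that AGP moves 4 bytes per transfer, and that the SDRAM is 64 bits wide and runs at FSB speed (all assumptions about this setup, none of it measured):

```python
# Back-of-the-envelope bus bandwidths for the test machine.
# Assumptions (not measured): AGP clock = FSB * 2/3 (BX divider),
# AGP is 32 bits (4 bytes) wide, SDRAM is 64 bits (8 bytes) wide at FSB speed.

NOMINAL_AGP_MHZ = 200.0 / 3.0   # ~66.67 MHz, the standard AGP clock
AGP_DIVIDER = 2.0 / 3.0         # assumed FSB-to-AGP divider


def agp_bandwidth_mb_s(fsb_mhz, multiplier, divider=AGP_DIVIDER):
    """Peak AGP transfer rate in MB/s for a given FSB and AGP mode (1x, 2x...)."""
    agp_clock_mhz = fsb_mhz * divider
    return agp_clock_mhz * 4 * multiplier


def effective_agp_mode(fsb_mhz, multiplier, divider=AGP_DIVIDER):
    """AGP mode expressed relative to a bus running at the nominal AGP clock."""
    return multiplier * (fsb_mhz * divider) / NOMINAL_AGP_MHZ


def sdram_bandwidth_mb_s(fsb_mhz):
    """Peak main memory bandwidth in MB/s for 64-bit SDRAM at FSB speed."""
    return fsb_mhz * 8


if __name__ == "__main__":
    for mode in (1, 2):
        print("AGP %dx at 133 MHz FSB: effectively %.2fx, about %.0f MB/s peak"
              % (mode, effective_agp_mode(133, mode), agp_bandwidth_mb_s(133, mode)))
    print("Main memory: about %.0f MB/s peak" % sdram_bandwidth_mb_s(133))
```

These are theoretical peaks only; what the chip actually sustains over AGP is lower.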
Desktop resolution is 1152x864, local memory available for textures is
GART texture size was always 3MB less than GARTsize (32MB GART size
unless specifically mentioned). The BIOS AGP aperture size was 128MB, but I
could not test with a GART size of 64MB in xorg.conf (hard lockup when
starting X, without anything unusual in the logs as far as I could
tell). I highly doubt it would have made any difference in performance.
I tested with only the GART heap, with only local tex memory, and
with both. Note that some quick hacks to disable the local tex memory
were unsuccessful, with results ranging from chip lockups and hard lockups
to segfaults (the latter when I used a size of 0 for the local tex
size), so I just hacked the local tex size to be 65KB instead. The GART heap
was disabled by using only 1 texture heap.
QuakeIII 1.32b, 800x600 windowed, timedemo demo four, with color tiling,
with texture tiling (that's another story, btw...), with hyperz, without
compressed textures, 32bit textures, trilinear. "best" means highest
texture quality, "2nd" means I used the second-highest texture quality
Some additional results to provide some information about the "in-use"
texture sizes of these quake3 benchmark runs:
AGP 2x, 16MB GART, GART only, best: 13 fps
AGP 2x, 16MB GART, GART only, 2nd: 67 fps
AGP 2x, 8MB GART, GART+local, best: 60 fps
AGP 2x, 8MB GART, GART+local, 2nd: 74 fps
And for reference:
AGP 2x, GART+local, 16bit textures (still with tex tiling), best: 77 fps
AGP 2x, GART+local, compressed textures, best: 85 fps
AGP 2x, GART+local, without texture tiling, best: 57 fps (just a teaser
of what you can expect from that patch, the good news is that it now
actually seems to be fully working...)
All results were only "reasonably consistent": I got something like +- 2
fps. IIRC I got considerably more reliable results in the past.
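With a run-to-run variation of about 2 fps, small differences between configurations should not be over-interpreted. A minimal sketch of how the listed results could be checked against that noise follows. The helper names are made up for illustration, and requiring a gap of more than twice the noise is just a conservative assumption of mine, not a proper statistical test:

```python
import re

NOISE_FPS = 2.0  # the "+- 2 fps" run-to-run variation quoted above

# The four "in-use texture size" results, copied from the list above.
RESULTS = """\
AGP 2x, 16MB GART, GART only, best: 13 fps
AGP 2x, 16MB GART, GART only, 2nd: 67 fps
AGP 2x, 8MB GART, GART+local, best: 60 fps
AGP 2x, 8MB GART, GART+local, 2nd: 74 fps
"""


def parse_results(text):
    """Turn 'label: N fps' lines into a {label: fps} dictionary."""
    results = {}
    for line in text.splitlines():
        match = re.match(r"(.+):\s*(\d+)\s*fps", line)
        if match:
            results[match.group(1).strip()] = float(match.group(2))
    return results


def is_significant(fps_a, fps_b, noise=NOISE_FPS):
    """Both runs may be off by `noise`, so require a gap larger than 2 * noise."""
    return abs(fps_a - fps_b) > 2 * noise


def slowdown_percent(baseline_fps, other_fps):
    """How much slower other_fps is than baseline_fps, in percent."""
    return 100.0 * (baseline_fps - other_fps) / baseline_fps


if __name__ == "__main__":
    res = parse_results(RESULTS)
    best = res["AGP 2x, 16MB GART, GART only, best"]
    second = res["AGP 2x, 16MB GART, GART only, 2nd"]
    print("best vs. 2nd significant:", is_significant(best, second))
    print("slowdown: %.0f%%" % slowdown_percent(second, best))
```

By this measure the drop from 67 to 13 fps with a 16MB GART-only heap is far outside the noise, while a gap of only 3 or 4 fps would not be.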
So, now the interesting part, interpretation of the results...
When not using GART texturing, AGP 1x vs. AGP 2x doesn't make a dime's
worth of difference (that's not exactly news, nothing to see here, move
along...).
BUT, GART texturing performance definitely takes a big hit with AGP 1x
vs. 2x. With AGP 2x, however, overall performance is only around 15%
slower than with local memory. Obviously, using the GART texture heap is
MUCH preferable to texture thrashing, where the performance really tanks
completely. You can't see it from the numbers, but with those 33 fps with
only local memory, for instance, there are some sections of the benchmark
run where the framerate is constantly below 10 fps for several seconds.
OTOH some parts seem to run a bit faster than when you have both local
and GART textures.
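To illustrate why thrashing hurts so much more than slower GART texturing, here is a toy model. It assumes equally sized textures, a frame that touches every texture in the same order, and least-recently-used eviction. These are simplifications for the sake of the example, not a description of what the driver's heap code actually does:

```python
from collections import OrderedDict


def hit_rate(num_textures, heap_slots, frames=100):
    """Fraction of texture accesses served without an upload.

    Every frame touches textures 0..num_textures-1 in order; the heap holds
    heap_slots textures and evicts the least recently used one when full.
    """
    if heap_slots < 1:
        raise ValueError("heap_slots must be at least 1")
    heap = OrderedDict()  # keys are ordered from least to most recently used
    hits = 0
    accesses = 0
    for _ in range(frames):
        for tex in range(num_textures):
            accesses += 1
            if tex in heap:
                hits += 1
                heap.move_to_end(tex)          # mark as most recently used
            else:
                if len(heap) >= heap_slots:
                    heap.popitem(last=False)   # evict least recently used
                heap[tex] = True               # "upload" the texture
    return hits / accesses


if __name__ == "__main__":
    # Working set one texture larger than the heap: every access misses.
    print("10 textures, 9 slots: ", hit_rate(10, 9))
    # A second heap that makes everything fit: only the first frame misses.
    print("10 textures, 10 slots:", hit_rate(10, 10))
```

In this model a working set that is only slightly too large for the heap already forces an upload on every single access, which fits the picture of framerates collapsing in some sections rather than degrading gradually.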
As a consequence, I think it would be a really good idea to enable
faster AGP modes and larger GART sizes by default, especially on those
ultra-low-mem (16MB or even 8MB, though the latter probably hardly ever
gets 3D acceleration at all) Radeon Mobilitys.
Also, the r200 driver really should get GART textures too (in fact, with
my rv250, which only has 33MB or so available for textures, there are parts
in some RTCW maps where performance goes down to 5 fps or so, whereas
the r100, thanks to its larger total available RAM for texture maps, still
manages 20 fps or so...).
btw texdown showed that texture transfers to card memory are faster than
to AGP memory, but not by very much (something like 100MB/s vs. 140MB/s
in the best case, though the numbers I got fluctuated quite a bit).
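To put those transfer rates in perspective, here is a quick estimate of what re-uploading textures every frame would cost. The 8 MB per frame is a made-up number purely for illustration, and the estimate ignores rendering time altogether:

```python
def upload_ms(megabytes, rate_mb_s):
    """Time in milliseconds to upload `megabytes` at `rate_mb_s` MB/s."""
    return 1000.0 * megabytes / rate_mb_s


def fps_ceiling(megabytes_per_frame, rate_mb_s):
    """Upper bound on the framerate if uploads were the only cost per frame."""
    return rate_mb_s / megabytes_per_frame


if __name__ == "__main__":
    HYPOTHETICAL_MB_PER_FRAME = 8  # assumption, not a measured value
    for rate in (100, 140):        # the two best-case texdown rates quoted above
        print("%d MB/s: %.0f ms per frame, at most %.1f fps"
              % (rate, upload_ms(HYPOTHETICAL_MB_PER_FRAME, rate),
                 fps_ceiling(HYPOTHETICAL_MB_PER_FRAME, rate)))
```

At either rate, a few megabytes of uploads per frame are enough to cap the framerate in the low tens, so which heap the uploads go to matters far less than avoiding them.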