Article Index
- 1 – Introduction
- 2 – Demo Pack Download
- 3 – Geometry Instancing Techniques
- 4 – Geometry Instancing Tests
- 5 – Quick Conclusion
2 – Demo Pack Download
You can grab the demo pack here:
[download#140#image]
You can control the camera with the mouse and move it with the W, A, S, D keys.
The skybox used in the demo comes from this page.
In short, geometry instancing makes it possible to render several instances of the same mesh at once. Geometry instancing techniques aim to minimize the number of draw calls (and/or to speed them up) required to render all instances. The ideal scenario is a single render call for all instances, one render call to rule them all!
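To make the draw call argument concrete, here is a minimal sketch that only counts calls. It does not talk to OpenGL: `MockRenderer`, `draw_mesh` and `draw_mesh_instanced` are hypothetical names invented for this illustration, and the demo itself may be organized differently.

```python
from dataclasses import dataclass


@dataclass
class MockRenderer:
    """Hypothetical stand-in for a graphics API; it only counts calls."""
    draw_calls: int = 0
    instances_drawn: int = 0

    def draw_mesh(self, transform):
        # One call renders one instance (comparable to a plain draw call).
        self.draw_calls += 1
        self.instances_drawn += 1

    def draw_mesh_instanced(self, transforms):
        # One call renders every instance (comparable to an instanced draw).
        self.draw_calls += 1
        self.instances_drawn += len(transforms)


def render_naive(renderer, transforms):
    """Issue one draw call per instance."""
    for transform in transforms:
        renderer.draw_mesh(transform)


def render_instanced(renderer, transforms):
    """Issue a single draw call for all instances."""
    renderer.draw_mesh_instanced(transforms)


# 20,000 placeholder positions, matching the smaller asteroid count.
transforms = [(float(i), 0.0, 0.0) for i in range(20_000)]

naive = MockRenderer()
render_naive(naive, transforms)

instanced = MockRenderer()
render_instanced(instanced, transforms)

print(naive.draw_calls, instanced.draw_calls)  # prints "20000 1"
```

Both paths draw the same 20,000 instances; only the number of calls the CPU has to issue changes, which is the cost instancing tries to reduce.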
This GI demo pack includes 5 GI techniques. Each technique can be enabled with one of the F2 to F6 keys. To illustrate the techniques, the demo renders an asteroid belt with asteroids, lots of asteroids…
Asteroid belt and Geometry Instancing: up to 180 million polygons rendered in real time
The demo is available in 10 versions (sorry, I was too lazy to code a GUI – maybe in a real benchmark version, who knows…):
- 20,000 asteroids, 18 triangles per asteroid: 360,000 tri.
- 20,000 asteroids, 72 triangles per asteroid: 1,440,000 tri.
- 20,000 asteroids, 450 triangles per asteroid: 9,000,000 tri.
- 20,000 asteroids, 800 triangles per asteroid: 16,000,000 tri.
- 20,000 asteroids, 1800 triangles per asteroid: 36,000,000 tri.
and
- 100,000 asteroids, 18 triangles per asteroid: 1,800,000 tri.
- 100,000 asteroids, 72 triangles per asteroid: 7,200,000 tri.
- 100,000 asteroids, 450 triangles per asteroid: 45,000,000 tri.
- 100,000 asteroids, 800 triangles per asteroid: 80,000,000 tri.
- 100,000 asteroids, 1800 triangles per asteroid: 180,000,000 tri.
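The totals in the list above are simply the asteroid count multiplied by the triangles per asteroid. A short sketch that reproduces the ten figures (the constant and function names are illustrative, not taken from the demo):

```python
# Values taken from the list of demo versions above.
ASTEROID_COUNTS = (20_000, 100_000)
TRIANGLES_PER_ASTEROID = (18, 72, 450, 800, 1800)


def total_triangles(asteroids, triangles_per_asteroid):
    """Triangles submitted per frame for one demo version."""
    return asteroids * triangles_per_asteroid


# One (asteroids, triangles per asteroid, total) entry per demo version.
versions = [
    (asteroids, tris, total_triangles(asteroids, tris))
    for asteroids in ASTEROID_COUNTS
    for tris in TRIANGLES_PER_ASTEROID
]

for asteroids, tris, total in versions:
    print(f"{asteroids:,} asteroids x {tris} tri = {total:,} tri")
```

The smallest version comes out at 360,000 triangles and the largest at 180,000,000, matching the list.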
It would be interesting to see how the test behaves with different instancing methods: instanced arrays, texture buffers, etc. Also, 1800 triangles isn’t that much for the maximum.
Nice work anyway!
Thanks Mr Groove!
I’ll update this demopack with new techniques next time (at least with instanced array). And I’ll increase the number of polygons per instance 😉
FWIW, note that with more triangles, you’ll hit the triangle setup bottleneck of one triangle per clock cycle harder. I don’t know about the NVidia GTX 480, but this limit applies to the ATI R5xxx; I don’t think that going beyond 180M triangles will bring you anything good for this generation of boards.
Nice summary of instancing techniques and performance though. Did you try getting finer-grained timings with the ARB_timer_query extension?
Cheers
Yes, keep these demos and tutorials coming. I enjoy reading them as it keeps me up to date on the newest features I can use with OpenGL. Plus it’s nice to see the old vs. new method of doing the same thing so one can make their own decision on what to do.
Thanks!!! keep up the good work!
20,000 instances x 18 tri/instance = 360,000 tri
Geforce GTX 470@ 700 core /1800 mem
– F2: FPS=18, GPU=20%, CPU=30%
– F3: FPS=40, GPU=21%, CPU=35%
– F4: FPS=68, GPU=15%, CPU=35%
– F5: FPS=96, GPU=19%, CPU=35%
– F6: FPS=101, GPU=24%, CPU=30%
100,000 instances, 1800 tri/instance = 180,000,000 tri
Geforce GTX 470@ 700 core /1800 mem
– F2: FPS=4, GPU=67%, CPU=30%
– F3: FPS=6, GPU=99%, CPU=24%
– F4: FPS=6, GPU=99%, CPU=14%
– F5: FPS=6, GPU=99%, CPU=12%
– F6: FPS=6, GPU=99%, CPU=10%
Interesting… CPU usage goes down as GPU usage goes up. I thought the CPU would be less stressed with the lower geometry count…
I tested with a Radeon HD 2400, Catalyst 10.6.
The F6 technique half-failed: the asteroids are there, rotating, but all shading on them is turned off (all black). But the middle planet is shaded.
I suppose this was not intended.
The driver exposes all 3 required extensions.
20,000 instances x 18 tri/instance = 360,000 tri
ATI HD4770 @ 940 core / 4800 mem
– F2: FPS=44, GPU=50%, CPU=28%
– F3: FPS=57, GPU=58%, CPU=30%
– F4: FPS=54, GPU=40%, CPU=30%
– F5: FPS=140, GPU=42%, CPU=34%
– F6: FPS=152, GPU=34%, CPU=30% (no shading)
100,000 instances, 1800 tri/instance = 180,000,000 tri
ATI HD4770 @ 940 core / 4800 mem
– F2: FPS=5, GPU=99%, CPU=18%
– F3: FPS=5, GPU=99%, CPU=16%
– F4: FPS=5, GPU=99%, CPU=16%
– F5: FPS=4-8, GPU=99%, CPU=12-28%
– F6: FPS=1-6, GPU=99%, CPU=6-28% (no shading)
Same thing as Matumbo here, but I have a Radeon 4850 on Win7 with Cat. 10.5.
F6 – asteroids are all black with all triangle variations (different EXEs).
Could you share your source code please (both C++ and GLSL), in order to learn advanced techniques and programming in OpenGL?
Omg, make a GRAPH, not text :p
Completely off topic, but I think our friend JeGX should make a post about this.
http://www.techreport.com/discussions.x/19216
hay JeGX R u died or what?????
Not publishing any new article from 3 to 4 days???
Fight with GirlFriend?????????
r u alive????????
Where can I find the source of this demo, or a similar source? I get very bad performance with glDrawElements() (1000 cube instances max), and glDrawElementsInstanced gives the same result.
Yet I get a decent FPS when I run your demo (I’m on an ATI 4850, OpenGL 2.1).
Yeah, the source for this would be really handy!
180 million eh? According to 3dmark 01, my gtx 470 does 400 million.
http://jooh.no/web/GeForce_8600_GT_vs_GTX_470_polygon_performance.png
Instancing as in the article however, does not use as much cpu power, but is also less useful than “real” cpu aware polygons.
The Fermi-based GTX 470 has several setup engines running in parallel instead of just one, as in all previous GPUs (ATI and NVIDIA).
please delete my previous post
where do i get the source code?
Thanks!