NVIDIA GF100 Architecture Details

Fermi GF100 - Architecture overview

[youtube j9F3W-v6PNI]

GF100 tessellation demo

After the first global overview in September 2009, NVIDIA has released new details on its Fermi GF100 architecture. Here is a summary of the GF100 architecture features, written as equations:

  • The CUDA core is the primary working unit of the GF100 (Each CUDA core is fully IEEE 754-2008 compliant) – GF100 = 512 CUDA cores
  • Streaming Multiprocessor (SM) = 32 CUDA cores – GF100 = 16 SM
  • 4 SFUs per SM (SFU – Special Function Unit – executes transcendental instructions such as sine, cosine, …) – GF100 = 64 SFUs
  • Graphics Processing Cluster (GPC) = 4 SM – GF100 = 4 GPC
  • 1 Raster Engine per GPC (raster engine = rasterization, z-culling). A raster engine processes 8 pixels per clock – GF100 = 32 pixels per clock
  • 1 PolyMorph Engine per SM (PolyMorph Engine: execution unit that handles geometry for GF100: vertex fetch, tessellation, viewport transform, attribute setup, and stream output) – GF100 = 16 PolyMorph Engines
  • 4 Texture Units per SM – GF100 = 64 Texture Units
  • 6 partitions of 8 ROPs (ROPs perform blending or AA) – GF100 = 48 ROPs
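
The equations above can be checked with a few lines of Python. This is only a rough sketch: the per-unit counts are the ones listed above, while the constant names and the `gf100_totals` function are made up for this example and do not come from NVIDIA.

```python
# Per-unit counts taken from the list above.
# All names here are hypothetical, chosen for this sketch only.
GPCS = 4                      # Graphics Processing Clusters on GF100
SMS_PER_GPC = 4               # Streaming Multiprocessors per GPC
CUDA_CORES_PER_SM = 32
SFUS_PER_SM = 4
TEXTURE_UNITS_PER_SM = 4
POLYMORPH_ENGINES_PER_SM = 1
RASTER_ENGINES_PER_GPC = 1
PIXELS_PER_CLOCK_PER_RASTER_ENGINE = 8
ROP_PARTITIONS = 6
ROPS_PER_PARTITION = 8


def gf100_totals():
    """Derive the chip-wide totals from the per-unit counts."""
    sms = GPCS * SMS_PER_GPC
    raster_engines = GPCS * RASTER_ENGINES_PER_GPC
    return {
        "sm": sms,
        "cuda_cores": sms * CUDA_CORES_PER_SM,
        "sfu": sms * SFUS_PER_SM,
        "texture_units": sms * TEXTURE_UNITS_PER_SM,
        "polymorph_engines": sms * POLYMORPH_ENGINES_PER_SM,
        "raster_engines": raster_engines,
        "pixels_per_clock": raster_engines * PIXELS_PER_CLOCK_PER_RASTER_ENGINE,
        "rops": ROP_PARTITIONS * ROPS_PER_PARTITION,
    }


# Print one "name = total" line per unit type.
for name, total in gf100_totals().items():
    print(f"{name} = {total}")
```

Everything except the ROPs scales with the number of SMs or GPCs. The ROP count depends only on the 6 ROP partitions, which is why it is not a multiple of 16.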

GF100 die

SM (Streaming Multiprocessor) detail

CUDA core detail

The tessellator of the PolyMorph Engine is THE BIG FEATURE of GF100:

[youtube K3m9rPltA_s]


4 thoughts on “NVIDIA GF100 Architecture Details”

  1. Leith

    Sounds like you might need to add tessellation to your FurMark program to give those PolyMorphs a run for their money…

  2. JeGX Post Author

    Yep I’m thinking about that! I’m adding a new feature in FurMark (not yet tessellation) related to tessellation…

  3. Korvin77

    I’m so upset because ATI had a tessellator long ago, but it was just a dead piece of die

  4. Pingback: SHP (smoothed particle hydrodynamics ) « O BALBIO 3D
