buckhorn

GTX TITAN 6GB June 1, 2013, 2:48 a.m.

twod
What I meant was that the 480/580 cards are much better at compute than the 600 series. For the new generation, Nvidia optimized all its 600 series cards for graphics, removing some compute features in order to scale up the number of ALUs (1536, up from 512; they had to make the ALUs less complex). When executing code with a lot of branches, the 600 series has a bit of a handicap compared to the 580/480. And compute workloads tend to have more branching than graphics workloads.

Interesting.
What compute features have been removed? And how are the ALUs less complex? If branches diverge within a thread group (warp), each path is run serially on both the Fermi and Kepler architectures, and good GPU code avoids this situation whenever possible.
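
To make the divergence point concrete, here is a toy cost model in Python. It is only a sketch: `serialized_passes` is a made-up helper, not a driver or profiler API, and it assumes a single if/else where a warp that disagrees on the condition runs both sides one after the other.

```python
WARP_SIZE = 32  # threads per warp on both Fermi and Kepler


def serialized_passes(branch_taken):
    """Count the passes one warp needs to get through a single if/else.

    branch_taken holds one boolean per thread in the warp. If every
    thread agrees, the warp runs one side only (1 pass). If the threads
    disagree, both sides run back to back (2 passes).
    """
    if len(branch_taken) != WARP_SIZE:
        raise ValueError("expected one flag per thread in the warp")
    return len(set(branch_taken))


# Branching on the warp index: every thread in the warp agrees.
uniform = [(tid // WARP_SIZE) % 2 == 0 for tid in range(WARP_SIZE)]

# Branching on thread index parity: neighbouring threads disagree.
divergent = [tid % 2 == 0 for tid in range(WARP_SIZE)]

print(serialized_passes(uniform))    # 1
print(serialized_passes(divergent))  # 2
```

The same rule applies to both architectures in this model, which is why divergence alone does not separate Fermi from Kepler.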

twod
The Titan is closer in architecture to the 580 than the 680, so many people (myself included) expected it to perform a lot better in compute than it does.

I'd argue that the Titan is much closer to the 680.
The main reason people see reduced performance on Kepler, IMO, is that most performant code for previous architectures reduces global memory accesses through shared memory (local memory in OpenCL speak), and on Kepler the amount of shared memory per multiprocessor didn't scale up with the increased core count. Writing high-performance code on Kepler will often involve register swapping, which isn't available via OpenCL, to compensate for less shared memory per core.
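
A quick back-of-the-envelope check of the shared memory point, sketched in Python. The per-multiprocessor figures are assumptions taken from the commonly quoted specs (32 cores per SM on GF110, 192 cores per SMX on GK110, and a maximum of 48 KiB of shared memory per multiprocessor on both); the helper name is made up for illustration.

```python
KIB = 1024

# Assumed figures, using the largest shared memory configuration.
FERMI_SM = {"cores": 32, "shared_bytes": 48 * KIB}     # GF110, e.g. GTX 580
KEPLER_SMX = {"cores": 192, "shared_bytes": 48 * KIB}  # GK110, e.g. GTX Titan


def shared_bytes_per_core(multiprocessor):
    """Shared memory available per core if it were split evenly."""
    return multiprocessor["shared_bytes"] // multiprocessor["cores"]


fermi = shared_bytes_per_core(FERMI_SM)
kepler = shared_bytes_per_core(KEPLER_SMX)

print(fermi, kepler, fermi // kepler)  # 1536 256 6
```

Under those assumptions each Kepler core has about a sixth of the shared memory a Fermi core had, so a kernel tuned around shared memory tiling on Fermi does not carry over unchanged.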

You can find plenty of examples of Kepler cards outperforming Fermi many times over.
