Hey guys,
So I almost bought my workstation yesterday, but the guy at cc said that if I wait a month or so I can probably buy a GTX 1080, which is not only cheaper than the Titan X but also a much faster GPU.
I will be using this workstation mainly for pyro, fluid, and RBD simulations and rendering. Apart from this, some Unreal Engine 4 VR stuff.
When I compared the specs, the Titan X seems better VRAM-wise, as it has 12 GB whereas the GTX 1080 will have 8 GB. But the GTX 1080's memory speed is 10 Gbps and the Titan's is 7 Gbps. On the other hand, the Titan has 3072 CUDA cores versus the 1080's 2560!
To me, the new GTX 1080 seems to benefit gamers more than what I want to do with it, but still, it's almost half the price. I'd appreciate some advice from more experienced users.
So with these differences, please suggest whether I should wait for the GTX 1080 or get the Titan X, and which will be better for my purposes.
Other components of my rig:
Processor: 8-core i7-5960X Extreme Edition
RAM: 64 GB DDR4 Corsair Vengeance
Titan X specs:
http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-titan-x/specifications [geforce.com]
GTX 1080 specs:
http://www.geforce.com/hardware/10series/geforce-gtx-1080 [geforce.com]
Thanks guys,
ak
TITAN X vs GTX 1080 for simulations
- Amit Khanna
- Member
- 14 posts
- Joined: Oct. 2015
- Offline
- anon_user_37409885
- Member
- 4189 posts
- Joined: June 2012
- Offline
Titan X is great right now. It's got more RAM and is only a bit slower than what the 1080 is projected to achieve; keep an eye out for upcoming benchmarks.
If I'm reading these new cards right, all Pascal chips should have the potential to use shared memory in the future, and will have lower heat/power requirements, but until the Ti/Titan versions are released, the Titan X is probably better for sims.
Overall, a lot of sims are done on the CPU though, as GPU performance benefits can drop off dramatically when interacting with non-OpenCL parts of the sim and SOPs, and when there isn't enough VRAM.
You may be able to pick up a cheaper second-hand Titan X as gpuphiles swap out for the 1080.
Edit: Unified memory for the GP100 (not sure if the GP104 in the 1080 will do the same):
Finally, on supporting platforms, memory allocated with the default OS allocator (e.g. ‘malloc’ or ‘new’) can be accessed from both GPU code and CPU code using the same pointer. On these systems, Unified Memory is the default: there is no need to use a special allocator or for the creation of a special managed memory pool. Moreover, GP100’s large virtual address space and page faulting capability enable applications to access the entire system virtual memory. This means that applications can oversubscribe the memory system: in other words they can allocate, access, and share arrays larger than the total physical capacity of the system, enabling out-of-core processing of very large datasets.
Certain operating system modifications are required to enable Unified Memory with the system allocator. NVIDIA is collaborating with Red Hat and working within the Linux community to enable this powerful functionality.
https://devblogs.nvidia.com/parallelforall/cuda-8-features-revealed/ [devblogs.nvidia.com]
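To make the quoted behavior concrete, here's a minimal CUDA sketch of Unified Memory as it works today via `cudaMallocManaged` (the plain `malloc`/`new` route described in the quote is the newer GP100-class behavior). This is an illustrative assumption about the API usage, not anything Houdini's OpenCL path does:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel: scales an array in place. With Unified Memory the
// same pointer is valid on both host and device; on Pascal, pages
// migrate on demand via page faults, so the managed allocation can
// even exceed physical VRAM (oversubscription).
__global__ void scale(float* data, int n, float s) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= s;
}

int main() {
    const int n = 1 << 20;
    float* data = nullptr;

    // One allocation visible to CPU and GPU -- no explicit cudaMemcpy.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;     // touched on the CPU

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f); // touched on the GPU
    cudaDeviceSynchronize();                        // sync, then read on CPU

    printf("data[0] = %f\n", data[0]);
    cudaFree(data);
    return 0;
}
```

On pre-Pascal cards the whole managed allocation has to fit in VRAM; Pascal's page-faulting is what makes oversubscription possible.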
- moferad
- Member
- 18 posts
- Joined: Feb. 2013
- Offline
SLI'd 1080 Tis (when announced; more VRAM), combined with some DX12 memory stacking. I'm not sure if that's something that will actually be utilized, but that's my end goal. July-ish for a potential Ti release, or just wait for whatever the next Titan-class Pascal equivalent is.
I'm bummed we didn't get HBM2 in the cards though.
- Farmfield
- Member
- 65 posts
- Joined: Sept. 2014
- Offline
Anyone know how the unified memory works in regard to OpenCL? I haven't run any big sims on my GTX 1070 yet, but on my GTX 970 I often ran out of VRAM and had to uncheck OpenCL. I don't know if/at what point I will run into that with the 8 GB of the GTX 1070, but if the driver automatically accesses system RAM for overflow, that would be a non-issue.
That being said, (well, asked) I would wildly guess it's just implemented in CUDA, not OpenCL.
Edited by Farmfield - Sept. 17, 2016 03:03:49
- anon_user_37409885
- Member
- 4189 posts
- Joined: June 2012
- Offline
Looks like OpenCL 2 supports it.
http://developer.amd.com/community/blog/2014/10/24/opencl-2-shared-virtual-memory/ [developer.amd.com]
“OpenCL 2.0 introduces two landmark features: shared virtual memory and device-side enqueue. Here, we cover shared virtual memory.
Think about how you would access memory in OpenCL 1.2. Since the host and OpenCL devices don’t share the same virtual address space, you must explicitly manage the host memory, the device memory, and communication between the host and devices. You can’t use a host-memory pointer on the OpenCL device.
OpenCL 2.0 removes this limitation: the host and OpenCL devices can share the same virtual address range, so you no longer need to copy buffers between devices. In other words, no keeping track of buffers and explicitly copying them across devices! Just use shared pointers.
OpenCL 2.0 differentiates between coarse- and fine-grain (buffer and system) mechanisms, which define the granularity at which SVM buffers are shared. Updates to coarse- or fine-grain SVM are visible to other devices at synchronization points:”
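A rough host-side sketch of the coarse-grained SVM flow the AMD post describes (illustrative only: the function takes a pre-built context, queue, and kernel as givens, and error checking is omitted):

```c
#include <CL/cl.h>
#include <string.h>

/* Coarse-grained SVM: one allocation shared between host and device.
   Host access is bracketed by map/unmap; the runtime makes updates
   visible to the other side at those synchronization points. */
void run_svm_example(cl_context ctx, cl_command_queue queue, cl_kernel kernel) {
    size_t bytes = 1024 * sizeof(float);

    /* A shared pointer -- usable as a kernel argument directly, with no
       clCreateBuffer / clEnqueueWriteBuffer round trip. */
    float* data = (float*)clSVMAlloc(ctx, CL_MEM_READ_WRITE, bytes, 0);

    clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, data, bytes, 0, NULL, NULL);
    memset(data, 0, bytes);                 /* host writes through the pointer */
    clEnqueueSVMUnmap(queue, data, 0, NULL, NULL);

    clSetKernelArgSVMPointer(kernel, 0, data); /* device sees the same pointer */
    size_t gws = 1024;
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &gws, NULL, 0, NULL, NULL);
    clFinish(queue);

    clSVMFree(ctx, data);
}
```

Fine-grained SVM (where supported) drops the map/unmap requirement entirely; coarse-grained is the baseline every OpenCL 2.0 device must provide.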
Edited by anon_user_37409885 - Sept. 17, 2016 20:19:08
- malexander
- Staff
- 5155 posts
- Joined: July 2005
- Offline
Pascal offers page faulting, which requires virtualized memory. It means that instead of having the entire buffer in VRAM, parts of it can be ‘paged out’ to main memory (much like main memory and swap/pagefile). If the CL kernel needs a paged-out section, it ‘faults’ and swaps it into VRAM, possibly evicting some other memory chunk from VRAM. That should mean sims that previously failed due to insufficient memory will now succeed, albeit somewhat slower because of the swapping. Assuming your main memory is large enough to hold the swapped portions, that is.