OpenCL Memory Limits

sl0throp:
I'm curious why my card has 12 GB but OpenCL says the max allocation is 3072 MB. Is there a way to change that? This is my printout from About Houdini.
OpenCL Platform NVIDIA CUDA
Platform Vendor NVIDIA Corporation
Platform Version OpenCL 1.1 CUDA 7.0.29
OpenCL Device Quadro M6000
OpenCL Type GPU
Device Version OpenCL 1.1 CUDA
Frequency 1114 MHz
Compute Units 24
Device Address Bits 32
Global Memory 12288 MB
Max Allocation 3072 MB
Global Cache 384 KB
Max Constant Args 9
Max Constant Size 64 KB
Local Mem Size 47 KB
2D Image Support 32768x32768
3D Image Support 4096x4096x4096
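For anyone who wants to double-check these values outside Houdini, all three of the relevant fields come from clGetDeviceInfo. A minimal sketch, assuming an OpenCL SDK is installed (first GPU on the first platform, error handling omitted for brevity):

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_ulong global_mem = 0, max_alloc = 0;
    cl_uint address_bits = 0;
    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem), &global_mem, NULL);
    clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_alloc), &max_alloc, NULL);
    clGetDeviceInfo(device, CL_DEVICE_ADDRESS_BITS, sizeof(address_bits), &address_bits, NULL);

    /* Print in MB to match the About Houdini readout. */
    printf("Global Memory  %llu MB\n", (unsigned long long)(global_mem >> 20));
    printf("Max Allocation %llu MB\n", (unsigned long long)(max_alloc >> 20));
    printf("Address Bits   %u\n", address_bits);
    return 0;
}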
anon_user_37409885:
Two things: the addressable memory in your printout is 32-bit, which makes for a 4 GB limit, and the latest drivers from Nvidia are '64-bit' but have a bug where you can't address above 4 GB. We are waiting for Nvidia to fix the bug.
Note that AMD and CPU OpenCL can already address the full 64-bit address space.
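For reference, both numbers in the printout line up with those limits. The OpenCL 1.1 spec only requires CL_DEVICE_MAX_MEM_ALLOC_SIZE to be at least a quarter of global memory, which is exactly what is reported here, and 32 address bits cap the whole address space at 4 GiB:

    2^32 bytes   = 4096 MB    (ceiling imposed by 32-bit addressing)
    12288 MB / 4 = 3072 MB    (spec minimum for CL_DEVICE_MAX_MEM_ALLOC_SIZE)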
malexander:
> sanostol: I really hope Nvidia sees it as a bug, not a strategic decision to push their Tesla cards.

For this generation, the new Teslas are actually different GPUs from the Maxwell-based Quadros, Titans, and GeForces. The new Teslas use a GPU based on the Kepler design found in the GeForce 780, while the Maxwell architecture behind the new Quadro M and GeForce 900 series is quite different from Kepler. So Nvidia is now segmenting the markets by hardware, not just software. There's good reason, though: Maxwell's FP64 capability is severely limited (1/32 the FP32 rate), and many Tesla users require the extra precision, so Nvidia had to keep FP64 running well in the Tesla line.
But I agree, I certainly hope this is not an artificial limitation in the Maxwell-based Quadros and GeForces. Given that CUDA can manage 12 GB of VRAM, it does seem more like an OpenCL bug.
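As an aside, you can at least check whether a given device exposes double precision at all; the FP64 rate is a hardware detail the API doesn't report. A small sketch querying CL_DEVICE_DOUBLE_FP_CONFIG (with older 1.1 headers this token may live behind the cl_khr_fp64 extension instead of cl.h):

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    /* Non-zero config bits mean the cl_khr_fp64 path is available. */
    cl_device_fp_config fp64 = 0;
    clGetDeviceInfo(device, CL_DEVICE_DOUBLE_FP_CONFIG, sizeof(fp64), &fp64, NULL);
    printf("FP64: %s\n", fp64 ? "supported" : "not supported");
    return 0;
}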
anon_user_37409885:
As a side note on the SP-to-DP disparity: there was an interesting talk at GTC by the AMBER molecular dynamics team, where they compute in single precision and accumulate in double precision (IIRC), comparing DP, DPFP, SPFP, SPXP, and so on.
Video:
http://on-demand.gputechconf.com/gtc/2015/video/S5478.html
Slides:
http://on-demand.gputechconf.com/gtc/2015/presentation/S5226-Ross-Walker.pdf
sanostol:
> twod: For this generation, the new Teslas are actually different GPUs from the Maxwell-based Quadros, Titans, and GeForces. [...]

If this gets fixed, a dream would come true.
johner:
> MartybNz: As a side note on the SP-to-DP disparity: there was an interesting talk at GTC by the AMBER molecular dynamics team, where they compute in single precision and accumulate in double precision [...]

FWIW, we do the same thing. Most of the internal multigrid computations are single precision, but when we're looking at total error to decide whether we can stop iterating, we use double precision for accumulation, dot-product totals, and so on.
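For readers who haven't run into this pattern, here is a rough OpenCL C sketch of the idea (a hypothetical kernel, not the actual Houdini code): do the per-element math in float, but accumulate and reduce in double. It needs cl_khr_fp64 on the device:

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

/* Dot product: single-precision multiplies, double-precision accumulation.
   One partial sum is written per work-group; the host (or a second pass)
   adds up the partials. */
__kernel void dot_sp_mul_dp_acc(__global const float *a,
                                __global const float *b,
                                __global double *partial,
                                __local double *scratch,
                                const uint n)
{
    uint lid = get_local_id(0);

    double acc = 0.0;                     /* accumulate in double */
    for (uint i = get_global_id(0); i < n; i += get_global_size(0))
        acc += (double)(a[i] * b[i]);     /* multiply in float */

    scratch[lid] = acc;
    barrier(CLK_LOCAL_MEM_FENCE);

    /* Tree reduction within the work-group, still in double. */
    for (uint s = get_local_size(0) / 2; s > 0; s >>= 1) {
        if (lid < s)
            scratch[lid] += scratch[lid + s];
        barrier(CLK_LOCAL_MEM_FENCE);
    }
    if (lid == 0)
        partial[get_group_id(0)] = scratch[0];
}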
johner:
Just wanted to point out that the new Nvidia 352.09 beta drivers seem to fix the 4 GB OpenCL limitation! I got them for Linux here:
http://www.nvidia.com/download/driverResults.aspx/85057/en-us
I don't know the status under Windows, I'm afraid.
I ran a 200M-voxel smoke sim last night that used 11 GB on a K6000 and solved in under 4 seconds per frame.
We'd be very curious to hear experiences, successful or otherwise, if anyone has a chance to try these drivers.
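If anyone wants a quick way to confirm a driver really lifts the 4 GB ceiling, one rough test (my own sketch, not an official tool) is to create a buffer bigger than 4 GB and force the runtime to actually back it. This assumes a 64-bit host build, with error handling kept minimal:

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    cl_int err;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

    /* 6 GiB: comfortably above the 32-bit / 4 GB ceiling. */
    size_t size = (size_t)6 << 30;
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, size, NULL, &err);
    if (err != CL_SUCCESS) {
        printf("clCreateBuffer failed: %d\n", err);
        return 1;
    }

    /* Allocation can be lazy; write the last byte to force it for real. */
    char byte = 0;
    err = clEnqueueWriteBuffer(queue, buf, CL_TRUE, size - 1, 1, &byte,
                               0, NULL, NULL);
    if (err == CL_SUCCESS)
        printf("6 GiB buffer allocated and written OK\n");
    else
        printf("write at end of buffer failed: %d\n", err);

    clReleaseMemObject(buf);
    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}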