What happens if XPU scene exceeds VRAM?

Member
9 posts
Joined: Feb. 2023
Hi,
Coming from Octane, I wondered how XPU, as a hybrid renderer, behaves when the memory required for the scene exceeds the GPU's VRAM. Does it just start using more CPU and RAM?
Staff
469 posts
Joined: May 2019
XPU demand-loads many memory-heavy things, for example:
- shader textures (on a per-tile basis)
- geometry primvars (normals, uvs, etc...)
- volumes
- etc...

Demand-loading means that data is only loaded into GPU memory if it is needed mid-render. So, for example, a particle system can have a lot of primvars, but only the ones needed for rendering will be loaded into GPU memory.

For the stuff that it can demand-load, once the GPU memory is full, it will...
- stall rendering
- evict existing demand-loaded data
- continue rendering again
This means the memory profile will look like a saw-tooth as rendering progresses.
More details here
https://www.sidefx.com/docs/houdini/solaris/karma_xpu.html#textures
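
To make the saw-tooth concrete, here is a toy simulation of the stall / evict / continue cycle described above. It is only a sketch: the function name, the sizes, and the "evict everything demand-loaded" policy are assumptions for illustration, not how XPU actually chooses what to evict.

```python
def simulate_demand_loading(requests_mb, capacity_mb):
    """Return the demand-loaded memory in use (MB) after each request.

    Assumed policy (hypothetical): when the next item does not fit,
    rendering stalls, all demand-loaded data is evicted, and rendering
    continues with an empty demand-load pool.
    """
    used = 0
    profile = []
    for size in requests_mb:
        if size > capacity_mb:
            raise MemoryError("a single item is larger than the demand-load budget")
        if used + size > capacity_mb:
            used = 0  # stall, evict existing demand-loaded data, continue
        used += size
        profile.append(used)
    return profile


# Eight 300 MB loads against a 1000 MB budget (made-up numbers).
profile = simulate_demand_loading([300] * 8, capacity_mb=1000)
print(profile)  # [300, 600, 900, 300, 600, 900, 300, 600]
```

Usage climbs until the budget is hit, drops after the eviction, then climbs again, which is the saw-tooth shape.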

But it cannot demand-load everything, for example
- geometry + BVH data
- light textures
- etc...
These things are loaded into GPU memory before rendering begins. This means that it's possible for a GPU device to still fail with "out of memory", even with the demand-loading/out-of-core system in place.
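
As a rough illustration of why the up-front data matters, the sketch below treats VRAM as a fixed part plus whatever is left for demand-loaded data. The function name and all numbers are made up for the example; XPU does not expose a calculation like this.

```python
def demand_load_budget_mb(fixed_mb, vram_mb):
    """Return the MB left for demand-loaded data after the fixed data is resident.

    fixed_mb maps a label to the size (MB) of data that must be loaded
    before rendering begins. Raises MemoryError if that alone does not fit.
    """
    total_fixed = sum(fixed_mb.values())
    if total_fixed > vram_mb:
        raise MemoryError(
            f"fixed data needs {total_fixed} MB but the device has {vram_mb} MB"
        )
    return vram_mb - total_fixed


# Made-up sizes for the non-demand-loadable data.
fixed = {"geometry_and_bvh": 9000, "light_textures": 500}
print(demand_load_budget_mb(fixed, vram_mb=12000))  # 2500
```

With the same made-up scene on an 8000 MB device the function raises immediately: no amount of eviction helps, because none of that data can be evicted.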
Member
9 posts
Joined: Jan. 2017
In newer drivers, CUDA's default behavior is to let the driver spill data that hasn't been used for a long time into system memory. The cost is a huge slowdown if all of the data is in frequent use, but you'll never OOM unless you're low on system memory too. This option can be turned off in the control panel. You'd need to experiment with it in a situation that normally OOMs to determine the performance impact. I recommend turning it off, because otherwise you'll never get the OOM that tells you tuning something in the scene might be the better option. You can always switch it back on, see how slow it gets at the point where it OOM'd before, and make a decision based on that.
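
For that kind of experiment it helps to log GPU memory while the render runs. A minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names are hypothetical, and how spilled-to-system-memory data shows up in these numbers is not covered here.

```python
import subprocess

# Standard nvidia-smi query flags: one CSV line per GPU, values in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse lines like '1234, 24576' into (used_mib, total_mib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi once and return one (used_mib, total_mib) pair per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `query_gpu_memory()` every second or so during a render, once with the option on and once with it off, gives two memory profiles to compare against the render times.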
Staff
469 posts
Joined: May 2019
GnomeToys
CUDA's default behavior in newer drivers is to allow the drivers to manage spilling data that hasn't been used in a long time into system memory at the cost of huge slowdowns if all data is in frequent use, but they'll never OOM unless you're low on system memory too. This option can be turned off in the control panel

I've found it performant, but I have a laptop with an integrated GPU.

We're waiting on some more information from NVIDIA regarding this feature (e.g. it seems not to work with multiple GPUs). Once we have that, we'll make a more formal post outlining its behavior and what that means for XPU users.
Edited by brians - Feb. 1, 2024 03:59:31