PC frequently crashes when left to render Karma scenes

Member
8 posts
Joined: March 2017
Edit: Probably solved! XMP being too aggressive seems to be the issue. I went into BIOS, disabled XMP, then set the RAM speed to be one option lower than the max for my CPU. Haven't had any crashes since.

---

My Windows 10 PC crashes seemingly at random when rendering Houdini projects with Karma XPU, on both H19 and H19.5, and I've been trying to work out why. This has happened consistently across multiple projects (I haven't tested with Karma CPU).

My PC will be rendering and manage to output several EXRs before suddenly completely freezing - can't move the mouse, keyboard doesn't respond (hitting caps lock for example doesn't light up the caps lock LED). I always come back to find the CPU maxed out at 100% usage (I can tell because the fans are outputting lots of heat and the UPS shows that the PC is eating the max expected amount of watts). The only way to recover is to hard reset the PC using the button on the case.

So far, crashes have happened after as little as a couple of minutes of rendering, but sometimes not until after an hour.

I've tried reducing the number of CPU cores Karma uses for rendering (by anywhere from one to eight cores during testing), but that doesn't seem to make a difference.

RAM usage usually doesn't go above 30%. CPU and GPU are both maxed out at 100%, as expected for XPU rendering.

PC specs are:
CPU: TR 3990X
GPU: GTX 1080
RAM: Trident Z Neo 64 GB - set to 3600 MHz
MOBO: MSI Creator TRX40
SSD: 970 Evo Plus 1TB M.2
PSU: HX Platinum 750W
water cooling AIO
UPS - 900W

I know the hardware is compatible with each other, such as the RAM with the CPU (both the RAM hardware type and the 3600 MHz XMP config), since I actually read AMD's and G.Skill's documentation.

I am running the latest Nvidia Studio Driver, although the crashing has occurred while running previous versions too, so I don't think it's a driver issue.

Windows Event Viewer doesn't appear to show any related Error/Critical events other than the one critical event that is logged due to me hard resetting the computer.

Does anyone know what could be the cause of this issue? I can't seem to figure it out myself.
Edited by WhyIsHoudiniSoHardHelp - Sept. 10, 2022 20:40:25
Member
2536 posts
Joined: June 2008
Drop down a new Karma ROP, in case you changed something funky, and try rendering with a minimal amount of samples. Set traced samples to 3 or 9 and disable the denoiser. This might help determine if it is a heat problem, or a scene problem.

If you're not running a fan controller on your 1080, try one of the free ones for Windows. I'm using MSI Afterburner. It lets you draw a specific heat curve and dial in the maximum RPM for the GPU fan.

You could also try the nVidia Game Ready drivers instead of the Studio drivers?
Edited by Enivob - Aug. 1, 2022 16:13:15
Using Houdini Indie 20.0
Windows 11 64GB Ryzen 16 core.
nVidia 3050RTX 8GB RAM.
Member
710 posts
Joined: July 2005
Definitely install Afterburner if you haven't already. IIRC the GPU fan won't reach its max rpm (~4,000 rpm) without it. I also wonder if your power supply is adequate for a 64 core processor + GPU.
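
As a rough sanity check on the power supply question, the rated figures can be added up. This is only a sketch: the 280 W and 180 W values are the vendor-rated TDPs for the 3990X and the GTX 1080, the 75 W allowance for everything else is an assumption, and TDP is not the same as measured peak draw (boost behaviour such as PBO can push sustained draw higher).

```python
# Rough power-budget estimate from rated figures, not measurements.
CPU_TDP_W = 280    # AMD Threadripper 3990X rated TDP
GPU_TDP_W = 180    # Nvidia GTX 1080 rated TDP
PSU_RATED_W = 750  # HX Platinum 750W from the spec list in the first post
OTHER_W = 75       # ASSUMPTION: motherboard, RAM, SSD, AIO pump and fans


def power_budget(cpu_w, gpu_w, other_w, psu_w):
    """Return estimated load, remaining headroom and load fraction for a PSU."""
    load = cpu_w + gpu_w + other_w
    return {
        "load_w": load,
        "headroom_w": psu_w - load,
        "load_fraction": load / psu_w,
    }


if __name__ == "__main__":
    budget = power_budget(CPU_TDP_W, GPU_TDP_W, OTHER_W, PSU_RATED_W)
    print(
        f"Estimated load: {budget['load_w']} W, "
        f"headroom: {budget['headroom_w']} W, "
        f"PSU utilisation: {budget['load_fraction']:.0%}"
    )
```

On these numbers the PSU has headroom on paper, but short transient spikes are not captured by TDP, so a wattage reading from the UPS under full XPU load is the better evidence.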

Another thing you could try is alternately disabling your Optix (GPU) and Embree (CPU) devices to see if it's one or the other causing issues. Take a look at the Disabling devices section at the bottom: https://www.sidefx.com/docs/houdini/solaris/karma_xpu.html
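
The device-isolation test can be sketched as three render runs with different environments. The two variable names below are assumptions based on the Disabling devices section of the linked page and should be checked against the docs for your Houdini version; `isolation_runs` and `interpret` are hypothetical helper names, and the interpretation is only a first guess at where to look.

```python
import os

# ASSUMPTION: variable names as recalled from the "Disabling devices" docs
# section. Verify them against the docs for your Houdini build.
DISABLE_EMBREE = "KARMA_XPU_DISABLE_EMBREE_DEVICE"
DISABLE_OPTIX = "KARMA_XPU_DISABLE_OPTIX_DEVICE"


def isolation_runs(base_env=None):
    """Build one environment per test run: both devices, GPU only, CPU only."""
    base = dict(os.environ if base_env is None else base_env)
    # Start from a clean slate so a leftover setting does not skew the test.
    base.pop(DISABLE_EMBREE, None)
    base.pop(DISABLE_OPTIX, None)
    return {
        "cpu_and_gpu": dict(base),
        "gpu_only": {**base, DISABLE_EMBREE: "1"},
        "cpu_only": {**base, DISABLE_OPTIX: "1"},
    }


def interpret(crashed):
    """Map observed crashes (run name -> bool) to a first suspect."""
    if crashed["gpu_only"] and not crashed["cpu_only"]:
        return "Suspect the GPU side (Optix device, GPU, driver)."
    if crashed["cpu_only"] and not crashed["gpu_only"]:
        return "Suspect the CPU side (Embree device, CPU, RAM settings)."
    if crashed["gpu_only"] and crashed["cpu_only"]:
        return "Both crash alone: suspect something shared (RAM, PSU, scene)."
    if crashed["cpu_and_gpu"]:
        return "Only the combined load crashes: suspect power or heat."
    return "No crash reproduced: render for longer before concluding."


if __name__ == "__main__":
    for name, env in isolation_runs().items():
        flags = {k: env[k] for k in (DISABLE_EMBREE, DISABLE_OPTIX) if k in env}
        print(name, flags)
```

Each environment would be passed to whatever launches the render. Since the crashes reported above sometimes take an hour to appear, each run needs to be long enough to count as a pass.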
Member
7767 posts
Joined: Sept. 2011
Siavash Tehrani
Another thing you could try is alternately disabling your Optix (GPU) and Embree (CPU) devices to see if it's one or the other causing issues. Take a look at the Disabling devices section at the bottom: https://www.sidefx.com/docs/houdini/solaris/karma_xpu.html

This is the first thing I would try.

Disabling XMP would also be worth trying; remember that XMP is overclocking. Not sure if it's applicable to Threadripper, but PBO should also be disabled to rule that out.
Staff
468 posts
Joined: May 2019
WhyIsHoudiniSoHardHelp
I've been trying to solve this problem where my Windows 10 PC will crash seemingly at random when rendering Houdini projects with Karma XPU. This has occurred consistently across multiple projects (I haven't tested with karma CPU)

What version of Houdini are you running?
H19.5 XPU has a new "sparse textures" feature where only tiles of a texture are loaded into GPU RAM on demand (i.e. only when needed). This reduces memory consumption by quite a bit. But it had a memory bug where it would not detect out-of-memory situations, potentially leading to a crash.

This memory bug has been fixed in 19.5.326 and I'd be interested to see if that solves your issue.

Alternatively, 19.5.329 has a new environment variable that allows the user to disable the "sparse textures" feature altogether.
"export KARMA_XPU_OPTIX_SPARSE_TEXTURES=0"
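
Note that `export` is POSIX shell syntax; in a Windows Command Prompt the equivalent is `set KARMA_XPU_OPTIX_SPARSE_TEXTURES=0`. A platform-neutral alternative is to launch the render from a small Python wrapper that sets the variable for the child process only. The sketch below assumes nothing about the Houdini command line: the demo command is a stand-in that just prints the variable, and the helper names are hypothetical.

```python
import os
import subprocess
import sys


def env_with_sparse_textures_disabled(base_env=None):
    """Copy the environment and switch off XPU sparse textures in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["KARMA_XPU_OPTIX_SPARSE_TEXTURES"] = "0"
    return env


def run_with_env(command, env):
    """Run a command with the given environment and return its exit code."""
    return subprocess.run(command, env=env, check=False).returncode


if __name__ == "__main__":
    # Stand-in command: prints the variable as the child process sees it.
    # Replace it with your own Houdini or render command line.
    demo = [
        sys.executable,
        "-c",
        "import os; print(os.environ['KARMA_XPU_OPTIX_SPARSE_TEXTURES'])",
    ]
    run_with_env(demo, env_with_sparse_textures_disabled())
```

Setting the variable per process keeps the rest of the system unchanged, which makes it easy to compare renders with and without the feature.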
Member
8 posts
Joined: March 2017
brians
What version of Houdini are you running?

I've experienced the PC crashing in both H19 and H19.5, so this probably isn't the issue. Thanks for the idea though!
Member
20 posts
Joined:
In some cases it can help to disable the Embree device.

Attachments:
2023-05-10_02-29.png (71.4 KB)

Member
8 posts
Joined: March 2017
@oslo

Thanks for the tip. In my case, though, the problem was that the XMP RAM profile was too aggressive. Hopefully you saw the edit I made in the original post up top.