reinitialize optix device after device fail (karma xpu)

User Avatar
Member
41 posts
Joined: Aug. 2017
Online
Is there some way to reinitialize/reset an OptiX device if it fails during XPU rendering, so that it contributes to rendering again, or is the only way to restart Houdini?
User Avatar
Member
7863 posts
Joined: Sept. 2011
Offline
That would be neat. Not that I know of.
User Avatar
Staff
486 posts
Joined: May 2019
Offline
ronald_a
Is there some way to reinitialize/reset an OptiX device if it fails during XPU rendering, so that it contributes to rendering again, or is the only way to restart Houdini?

Try switching to KarmaCPU, then back to KarmaXPU
Edited by brians - Jan. 11, 2024 02:31:40
User Avatar
Member
41 posts
Joined: Aug. 2017
Online
brians
ronald_a
Is there some way to reinitialize/reset an OptiX device if it fails during XPU rendering, so that it contributes to rendering again, or is the only way to restart Houdini?

Try switching to KarmaCPU, then back to KarmaXPU

I'm afraid that does not work. It would be really helpful if this could be added. It would make debugging problems with XPU a lot more comfortable and quicker.
Edited by ronald_a - Jan. 11, 2024 04:17:24
User Avatar
Staff
486 posts
Joined: May 2019
Offline
brians
Try switching to KarmaCPU, then back to KarmaXPU
ronald_a
I'm afraid that does not work.

That is strange, as switching between CPU and XPU destroys and recreates the XPU renderer.

What is the error you're getting from the device?
You'll be able to see it when you enable the on-screen stats, as shown in this "how to" section of the XPU docs:
https://www.sidefx.com/docs/houdini/solaris/karma_xpu.html#howto
User Avatar
Member
41 posts
Joined: Aug. 2017
Online
In the renderstats it just says cuda error, and then the renderstats for the GPUs disappear. In the Log Viewer it says: "KarmaXPU: device Type:Optix ID:0 has registered a critical error , so will now stop functioning. Future error messages will be suppressed"

I am on Windows 10 and my driver version is 546.01 (with 2x RTX A4500).

I have attached a scene that reproduces the problem.

Attachments:
karmaXPU_reset_issue.hip (1.9 MB)

User Avatar
Staff
486 posts
Joined: May 2019
Offline
ronald_a
or is the only way to restart Houdini?

Are you finding that restarting houdini actually fixes the issue?

ronald_a
in the renderstats it just says cuda error

Are you able to screenshot that for me? thanks

ronald_a
I am on Windows 10 and my driver version is 546.01 (with 2x RTX A4500).

That driver is not within our recommended range.
https://www.sidefx.com/docs/houdini/solaris/karma_xpu.html#supported-hardware
https://www.sidefx.com/Support/system-requirements/

Are you able to test with 546.33? (It'll be listed as a GeForce driver, but should be fine for testing.)


Thanks a lot
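
For anyone who wants to confirm which driver is installed before testing, here is a minimal sketch in Python. It assumes `nvidia-smi` is on the PATH, and it treats 546.33 (the version suggested above for testing) as a simple lower bound. That bound is an assumption for illustration only; the actual recommended range is on the system requirements page linked above. The helper names are hypothetical.

```python
import subprocess

def parse_version(text):
    """Turn a driver string such as '546.33' into a comparable tuple (546, 33)."""
    return tuple(int(part) for part in text.strip().split("."))

def installed_driver_versions():
    """Ask nvidia-smi for the driver version; it prints one line per GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_version(line) for line in out.splitlines() if line.strip()]

# Assumption: compare against the version suggested for testing in this thread,
# not against the official recommended range.
SUGGESTED = parse_version("546.33")

if __name__ == "__main__":
    for index, version in enumerate(installed_driver_versions()):
        status = "ok" if version >= SUGGESTED else "older than 546.33"
        print(index, ".".join(map(str, version)), status)
```

Comparing tuples rather than raw strings avoids surprises when the version strings differ in length.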
User Avatar
Member
41 posts
Joined: Aug. 2017
Online
OK, so I installed the recommended driver, which did not change anything, but then I remembered that I had some Karma OptiX bug-fixing variables set (KARMA_XPU_OPTIX_FORCE_GAS_TRACE, KARMA_XPU_OPTIX_VALIDATION_MODE, KARMA_XPU_OPTIX_DC_STACK_BOOST), which seemed to cause the problem. I removed them and now the scene renders as expected.

It is now also true that if XPU devices fail, switching to CPU and back to XPU restarts XPU, and the XPU devices work again without restarting Houdini.
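
In case it helps someone hitting the same thing, a quick way to see whether any of these variables are still set is a small Python check like the sketch below. It only uses the three variable names mentioned above and the standard library; the function name is hypothetical. Running it from Houdini's Python shell should show what the running session actually sees, on the assumption that the session's environment is what matters here.

```python
import os

# Variable names taken from the post above; what each one does is not covered here.
SUSPECT_VARS = (
    "KARMA_XPU_OPTIX_FORCE_GAS_TRACE",
    "KARMA_XPU_OPTIX_VALIDATION_MODE",
    "KARMA_XPU_OPTIX_DC_STACK_BOOST",
)

def find_set_vars(environ=None, names=SUSPECT_VARS):
    """Return {name: value} for each listed variable that is present in environ."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in names if name in env}

if __name__ == "__main__":
    found = find_set_vars()
    if not found:
        print("None of the listed KARMA_XPU_OPTIX_* variables are set.")
    for name, value in found.items():
        print(f"{name}={value}")
```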
User Avatar
Staff
486 posts
Joined: May 2019
Offline
Great!
Glad we got to the bottom of it.
Thanks Ronald