Karma XPU feels really slow!

Member
355 posts
Joined: Nov. 2015
Maybe it's just me, but Karma XPU has been quite disappointing so far in H20! I'm just running really simple scenes on a 3090 that would take seconds in Redshift, and it's taking 5 mins here with low samples (512), so it's also really noisy. This certainly doesn't feel like a renderer for freelance use. Big disappointment for me! Maybe it's the version of H20 (506), or maybe I just don't know what I'm doing. I'd love to hear from others what the experience has been like so far with XPU.
Edited by traileverse - Nov. 12, 2023 17:49:10
hou.f*ckatdskmaya().forever()
Staff
469 posts
Joined: May 2019
Thanks for letting us know.

Ideas:
  1. Is your GPU device even working? When in IPR mode, what devices are listed in the top right of the viewport? Do you see Optix up there somewhere?
  2. Is it running significantly slower than what is in H19.5?
  3. Is it possibly taking a while to compile the shaders for GPU? (i.e., can you see a % in the top right of the viewport, or is it still showing "compile" or "init"?)
  4. What kind of scene are you rendering? Are you able to post an example here so we can take a look?

One thing to note is that H20 XPU needs NVIDIA driver 535+ (545+ recommended). It's possible that the Optix device is failing to initialize, but somehow the error has not been visible to you for whatever reason.

Cheers
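
The driver requirement mentioned above can be checked from a script. The following is a minimal sketch, not an official SideFX tool: it assumes `nvidia-smi` is on the PATH, and the function names and the "too old / supported / recommended" labels are made up for illustration. The 535 and 545 thresholds are the ones stated in this post. It only checks the driver version; it does not tell you whether the Optix device actually initialized.

```python
import re
import subprocess
from typing import List, Optional

MIN_DRIVER_MAJOR = 535          # minimum stated in this thread for H20 XPU
RECOMMENDED_DRIVER_MAJOR = 545  # recommended version stated in this thread


def driver_major(version: str) -> Optional[int]:
    """Return the leading integer of a driver string such as '546.01'."""
    match = re.match(r"\s*(\d+)", version)
    return int(match.group(1)) if match else None


def classify_driver(version: str) -> str:
    """Classify a driver version string against the two thresholds above."""
    major = driver_major(version)
    if major is None:
        return "unknown"
    if major < MIN_DRIVER_MAJOR:
        return "too old"
    if major < RECOMMENDED_DRIVER_MAJOR:
        return "supported"
    return "recommended"


def query_driver_versions() -> List[str]:
    """Ask nvidia-smi for each GPU's driver version (empty list on failure)."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Typical use would be `for v in query_driver_versions(): print(v, classify_driver(v))`. An empty result means `nvidia-smi` could not be run, which is itself worth investigating.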
Member
355 posts
Joined: Nov. 2015
Hi Brian, here is a file that takes 5 mins on my machine (specs below). Rendering with similar settings in Redshift (much, much cleaner in terms of noise) took a little over a minute.

1. The GPU seems to be working fine (hopefully it is lol)
2. Didn't really use XPU in H19.5


System Specs:
CPU - Ryzen 3950x
GPU - RTX 3090
RAM - 64GB
Windows 11
Nvidia Studio Driver Version: 546.01
Edited by traileverse - Nov. 12, 2023 20:10:27

Attachments:
karma_simple_test.hiplc (1.1 MB)

Member
7770 posts
Joined: Sept. 2011
This scene is a very bad scene. It's not one I would expect to be able to render without a lot of noise, or without using denoisers or some indirect acceleration like a photon cache. It's a white glossy box being indirectly lit entirely from a blown-out diffuse reflection, and it has shallow depth of field and refractive objects. A pure brute force renderer like Karma, which has no optimizations (cheats), is going to be slow to render this.

A simple scene is one that is directly lit and doesn't have light bouncing around forever off white glossy walls, like a torus or a teapot or something.

Edit:
The scene doesn't appear to load correctly for me. It's using materials that are using textures I don't have.

is this what you see?
Edited by jsmack - Nov. 12, 2023 20:31:21

Attachments:
karma_simple_test.jpg (1.6 MB)
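
As background to the point about brute-force rendering: in plain Monte Carlo path tracing, noise falls roughly with the square root of the sample count, so halving the noise takes about four times the samples. The sketch below works that through using the 512 samples and 5 minutes from the original post. It assumes uniform sampling and render time that scales linearly with samples; Karma's actual convergence behaviour, and any denoiser, may differ. The function names are illustrative only.

```python
import math


def samples_for_noise_factor(current_samples: int, noise_factor: float) -> int:
    """Samples needed to scale noise by noise_factor (0.5 = half the noise),
    assuming error is proportional to 1 / sqrt(samples)."""
    if not 0 < noise_factor <= 1:
        raise ValueError("noise_factor must be in (0, 1]")
    return math.ceil(current_samples / noise_factor ** 2)


def estimated_render_seconds(base_seconds: float, base_samples: int,
                             new_samples: int) -> float:
    """Estimated render time, assuming time scales linearly with sample count."""
    return base_seconds * new_samples / base_samples


# 512 samples in about 5 minutes, as reported in the first post.
needed = samples_for_noise_factor(512, 0.5)              # 2048 samples
seconds = estimated_render_seconds(5 * 60, 512, needed)  # 1200 s, i.e. 20 min
print(needed, seconds / 60)
```

Under these assumptions, cleaning up a noisy 5-minute render by brute force alone gets expensive quickly, which is why portals, denoising, or a simpler lighting setup matter so much for this kind of scene.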

Member
355 posts
Joined: Nov. 2015
Yeah, I see what you're saying about the type of scene and the fact that it's a brute force renderer. I need to render indoor architecture type scenes, and this is my bad stress test. I might have had unrealistic expectations about where XPU would be in comparison to Redshift in terms of speed, without taking into account the fact that it is a brute force renderer. I still need to learn more about using Karma, but I tested with other, simpler scenes and the speed just isn't what I expected. The bubble wrap scene in the content library takes nearly 30 mins to render even when I turn off depth of field. I get that it has a transmissive material and all, but is that economical? Is a 4090 really twice as fast, in a linear way, to cut that down to 15 mins?
Member
355 posts
Joined: Nov. 2015
Yes, that's the image. It's just using materials from the catalogue. What were your render times on it?
Member
7770 posts
Joined: Sept. 2011
traileverse
Yes, that's the image. It's just using materials from the catalogue. What were your render times on it?

It took 4:30 with the settings from the scene, 512 samples.

traileverse
The bubble wrap scene in the content library takes nearly 30 mins to render even when I turn off depth of field. I get that it has a transmissive material and all, but is that economical? Is a 4090 really twice as fast, in a linear way, to cut that down to 15 mins?

I was also surprised the bubblewrap scene took 30 minutes/1024 samples, but glossy refractions are always a difficult case. I made an optimized version of the material that doesn't use refraction and instead uses alpha transparency. It looks nearly the same and renders in about 10 minutes at 1024 samples. It doesn't have quite the 'forward scatter' translucency that the real one does though since it doesn't do refraction at all.
Member
355 posts
Joined: Nov. 2015
Could you share that optimized version? Also, are you on a 3090 as well?
Member
201 posts
Joined: Jan. 2013
Hi, I decided to do a render test of the content library examples too. I got the following render times from the bubble wrap example.

Default settings from the scene:

XPU: 11:15 min.
Path traced samples: 1024
Contribution of each device to the render:
NVIDIA RTX 3070 Ti: 49%
Embree CPU Device: 51%

CPU: 23:25 min.
Primary samples: 128

GPU only (KARMA_XPU_DISABLE_EMBREE_DEVICE=1): 21:25 min.
Path traced samples: 1024

My workstation configuration:

Houdini: 20.0.506 Production
OS: Manjaro 23.1.0 Vulcan
Kernel: x86_64 Linux 6.1.62-1-MANJARO
CPU: AMD Ryzen 9 5950X
GPU: NVIDIA GeForce RTX 3070 Ti
Driver Version: 535.129.03
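
For anyone wanting to reproduce the GPU-only run, the environment variable quoted above has to be set before Houdini starts. A minimal sketch of one way to do that from Python follows. Only the variable name `KARMA_XPU_DISABLE_EMBREE_DEVICE` comes from this post; the helper name, and the executable and scene file in the commented launch line, are placeholders.

```python
import os
from typing import Dict, Mapping, Optional


def gpu_only_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the Embree (CPU) XPU device disabled.

    The variable name is the one quoted in this post; the input mapping is
    copied, never modified.
    """
    env = dict(os.environ if base is None else base)
    env["KARMA_XPU_DISABLE_EMBREE_DEVICE"] = "1"
    return env


# Hypothetical launch (executable name and scene path are placeholders):
# import subprocess
# subprocess.run(["houdini", "bubble_wrap.hip"], env=gpu_only_env())
```

Setting the variable in the shell or in houdini.env before launching should work just as well; the point is only that it must be in place before the renderer initializes its devices.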
Member
33 posts
Joined: Feb. 2016
You're not the only one. I am also running render tests as we speak, and to be honest I am disappointed by the render speed and interactivity. I have 2x 4090 / AMD 5950X, and the render speed as well as the interactivity when dialing in numbers on the materials are not up to par with modern render engines. I guess Karma isn't meant for freelancers but more for bigger pipelines, where render speed doesn't count as much.
https://behance.net/derya [behance.net]
Staff
34 posts
Joined: May 2022
The Render Geometry Settings node has "Is Portal" set, but there's no associated portal object (the opening object, box2, in sopcreate1 is subtracted from box1 to create a one-room object with an opening). Please find attached a .hip file that has a portal object added. This should converge quicker in terms of noise.

Also, the main intensity parameter on Physical Sky was set to 50. This affects both sun and sky intensity. The visible patch of sunlight on the side wall acts as an indirect source of light with a high intensity. This also contributes to the noise level, especially as there are refractive objects in the scene too. In the attached file I left the main intensity at 1, set the Sky intensity (in the Sky tab) to 50 instead, and set the Sun intensity to a lower value of 7.

Attachments:
karma_simple_test_portal.hiplc (1.1 MB)

Staff
34 posts
Joined: May 2022
Setting the denoiser also helps to remove the extra noise left in the render.
Member
33 posts
Joined: Feb. 2016
Here's a comparison between a Redshift render and a Karma render. I tried to give both the same dif/refl/refr bounces and the same material, but the renders look a bit different, as if the Karma render needed more refr bounces.
Redshift at 1m25s
Karma at 7m24s

First one is Redshift, second one Karma.
Edited by Yader - Nov. 13, 2023 06:30:14
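
To put a number on the two timings above, here is a small sketch that parses the duration strings and computes the slowdown factor. The helper names are illustrative, and the result only describes this one scene with these settings.

```python
import re


def parse_duration(text: str) -> int:
    """Convert strings such as '1m25s', '7m24s' or '47sec' to whole seconds."""
    pattern = r"\s*(?:(\d+)\s*m(?:in)?)?\s*(?:(\d+)\s*s(?:ec)?)?\s*"
    match = re.fullmatch(pattern, text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def slowdown(reference: str, candidate: str) -> float:
    """How many times longer the candidate took than the reference."""
    return parse_duration(candidate) / parse_duration(reference)


# Timings reported in this post: Redshift 1m25s, Karma 7m24s.
print(round(slowdown("1m25s", "7m24s"), 1))  # 444 s / 85 s, about 5.2x
```

So on this particular glass test Karma XPU took roughly five times as long as Redshift, which is a large gap but depends heavily on the material and bounce settings being compared.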

Attachments:
Redshift_Material_Look_Development_v001_Glass_Blurry_01_Default.jpg (269.6 KB)
Redshift_Material_Look_Development_v001_Glass_Blurry_Karma.jpg (508.2 KB)

Member
355 posts
Joined: Nov. 2015
Yeah, I have my issues with Redshift, but economically there is no way I can use XPU at the moment with those times. I never used Octane, but isn't it also unbiased? I last tested Arnold GPU a year ago and it was in better shape than this. Karma, whether XPU or CPU, seems like a renderer for those with a render farm. I'm back to Redshift and will be testing Arnold again soon! SideFX is doing a great job overall, but Karma XPU is just not there yet. 6 or 7 times slower is a massive way off!
Member
355 posts
Joined: Nov. 2015
XPU at 11 mins? Could that be a Linux thing? Is it slower on Windows? And the CPU is doing half the work. Your GPU-only time is still better than my 3090. Why driver 535.129.03? Is that the latest on Linux?
Member
201 posts
Joined: Jan. 2013
Yes, XPU rendered the image in 11 minutes. I can't say anything about differences in how XPU behaves on Windows versus Linux; I haven't run tests with my configuration on Windows yet.
The driver used is the latest stable one from the repository.
Edited by alexwheezy - Nov. 13, 2023 07:41:26
Member
180 posts
Joined: Aug. 2018
Thanks for the useful comparison Yader. Would you be willing to share that Redshift demo scene - just to make sure all settings are the same - I'd like to run it with my 5950x / 2x 3090 rig.
Member
33 posts
Joined: Feb. 2016
Of course, here you go.


https://t1p.de/bryyf [t1p.de]
Member
33 posts
Joined: Feb. 2016
Here's another test, where Karma is just a bit slower, but looks better overall.
The material settings are the same, as well as the light setup. The background plate
doesn't seem to work in Karma XPU, so I had to create a grid with an emission that doesn't
affect indirect rays. The thin walled option together with SSS works really well in Karma here.
Redshift was at 38sec
Karma was at 47sec
Edited by Yader - Nov. 13, 2023 12:22:40

Attachments:
Balloon_Inflate_v001_0060.jpg (210.3 KB)
Balloon_Inflate_v001.karma1.0060.jpg (703.3 KB)

Member
7770 posts
Joined: Sept. 2011
alexwheezy
traileverse
XPU at 11 mins? Could that be a Linux thing? Is it slower on Windows? And the CPU is doing half the work. Your GPU-only time is still better than my 3090. Why driver 535.129.03? Is that the latest on Linux?

Yes, XPU rendered the image in 11 minutes. I can't say anything about differences in how XPU behaves on Windows versus Linux; I haven't run tests with my configuration on Windows yet.
The driver used is the latest stable one from the repository.

yeah, something sounds way off. It takes 30 minutes to render with a 3090 doing 90% of the work on Windows. A 3090 is probably 2x as fast as a 3070Ti in rendering too. Also 51% work done by a 5950? I don't buy it. A 3070Ti gpu is at least a few times faster than that CPU. Maybe you rendered it to viewport size and not to MPlay at 1920x1080?
Edited by jsmack - Nov. 13, 2023 12:50:47
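
One quick sanity check on the disputed 49%/51% split is to ask what combined time and work shares you would expect if the two devices' throughputs simply added. The sketch below uses the GPU-only (21:25) and CPU (23:25) timings reported earlier in the thread. It is a rough model only: the CPU figure came from a Karma CPU render at 128 primary samples, which is not the same thing as the Embree device inside XPU, and the function names are illustrative.

```python
from typing import List


def to_seconds(mm_ss: str) -> int:
    """Convert a 'MM:SS' string to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def ideal_combined_seconds(*device_seconds: float) -> float:
    """Render time if devices work together and their throughputs simply add."""
    return 1.0 / sum(1.0 / t for t in device_seconds)


def ideal_shares(*device_seconds: float) -> List[float]:
    """Fraction of the work each device would do under the same assumption."""
    rates = [1.0 / t for t in device_seconds]
    total = sum(rates)
    return [rate / total for rate in rates]


gpu = to_seconds("21:25")  # GPU-only XPU run reported in the thread
cpu = to_seconds("23:25")  # Karma CPU run reported in the thread (rough proxy)

combined = ideal_combined_seconds(gpu, cpu)
shares = ideal_shares(gpu, cpu)
print(round(combined), [round(s * 100) for s in shares])  # 671 [52, 48]
```

The model gives about 671 seconds (roughly 11:11) with a 52%/48% GPU/CPU split, close to the reported 11:15 and 49%/51%. That suggests the reported numbers are at least internally consistent, but it does not explain why they differ so much from the 3090 result; render resolution and output target remain plausible causes.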