Karma XPU feels really slow!

Member
342 posts
Joined: Feb. 2017
Offline
I've also personally found XPU to generally be faster than Redshift; however, in some cases it's slower to clean up frames (which makes sense, as RS has adaptive sampling and bucket mode and XPU is just progressive). RS also of course has the IPC pass, which helps clean up GI bounces.

That said, the big thing that killed me with RS was an utter lack of interactivity. Karma having both a great CPU implementation, as well as XPU being built from the ground up with Embree and OptiX, means that you get a much better time to first pixel than RS, as well as more viewport interactivity and less crashing (in my experience). Plus I've found it to be much less fiddly with drivers, and not having to deal with a third-party engine has just been a blessing.

I'm personally not prioritizing final render speed over those other things anymore, but I definitely understand if you are, and if that's the case, maybe it's worth waiting some more for adaptive sampling to come to XPU at least.

For me though, I definitely can't go back to Redshift; I've found XPU to be a godsend. I don't tend to do a ton of stuff with heavy refraction though, and I do do a lot of furry characters, deforming motion blur, and SSS, so I guess it's just worked out better for me.

One thing I will add though, on the topic of drivers: I was a fan of XPU, and I recently updated drivers and found it to be sooo much faster and more responsive.

Definitely looking forward to XPU maturing more. It seems to have the most momentum of any renderer in my eyes, which is definitely largely due to the fact that it is relatively new, but I am super glad to see XPU very rapidly improving.
http://www.christophers.website
Member
98 posts
Joined: Aug. 2015
Online
Exactly this!

Right tool for each task. Pick your fights and see what will get you to the goal easier, with less crashing and nerve-wracking.
Also, RS has been developed since 2013 if I remember correctly. That puts years of optimization in there. With Karma XPU as young as it is, I'm more than happy with the results and progress so far. Even running into some things that I'm not sure are due to my lack of skill with it or bugs, hehe, but pushing through them one at a time.
Member
355 posts
Joined: Nov. 2015
Offline
The more I use XPU the more I know that after my current RS subscription is up I’m done with it. As you rightly said, XPU is quite new compared to other renderers but the stability over RS is just too big of a draw to ignore. Some scenes might be slower, but rendering and denoising at higher res than final output has been good so far.

Adaptive sampling should be huge and I look forward to all the continuous performance improvements.

Had my doubts but Karma is definitely growing on me! It’s also fantastic with volumes so far.
hou.f*ckatdskmaya().forever()
Member
27 posts
Joined: March 2021
Offline
ali_f
Hi Mirko,

You may try the modified hip file from this post [www.sidefx.com] where the portal geometry is added to improve the dome light sampling.

The denoiser can be enabled via
- Karma Render Settings (node) > Image Output > Filters > Denoiser , or
- Display Options > Enable Denoising

There are two denoisers available: Optix (interactive), Intel OIDN (only applies after render is completed).

Don't they give blobby results when rendering animation?
Member
27 posts
Joined: March 2021
Offline
jsmack
I went and tested the bubble wrap scene with Cycles; I remembered it having similar raw performance when I tested it a few years ago. Although it doesn't have a thin transmissive material, it can simulate it with the solidify modifier by making the surface double walled. After testing, I think there's something broken with XPU, performance wise. The raw sampling speed with Cycles seems to be 30x higher than with XPU. I couldn't get the result to look exactly like the XPU one though, so maybe there's some incorrect thing Cycles is doing that makes it faster. 4096 samples takes on the order of 3-5 minutes for the bubble wrap scene, depending on how many bells and whistles you enable on the material. XPU was on the order of 30 minutes for 1024 samples with the absolute most basic shader.

This is the result with Cycles. It seems to be losing a lot of energy when it stacks up in depth, even though I have indirect unclamped and 24 bounces. Is it just fast because it's cheating and wrong? I would think even 4096 wrong samples would take longer than 1024 good ones.

This has no DOF? In Karma it is enabled by default.
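
For reference, the sampling rates implied by the numbers quoted above can be worked out directly. This is only back-of-the-envelope arithmetic on the figures from the quote (4096 samples in 3-5 minutes for Cycles, 1024 samples in roughly 30 minutes for XPU); it assumes a "sample" costs the same in both renderers, which it may well not.

```python
def samples_per_minute(samples: int, minutes: float) -> float:
    """Average sampling rate for a render, in samples per pixel per minute."""
    return samples / minutes


# Figures taken from the quoted post; both are rough "on the order of" numbers.
cycles_fast = samples_per_minute(4096, 3)   # Cycles, best case
cycles_slow = samples_per_minute(4096, 5)   # Cycles, worst case
xpu = samples_per_minute(1024, 30)          # Karma XPU, basic shader

ratio_low = cycles_slow / xpu
ratio_high = cycles_fast / xpu

print(f"Cycles: {cycles_slow:.0f}-{cycles_fast:.0f} spp/min")
print(f"XPU:    {xpu:.0f} spp/min")
print(f"Cycles is roughly {ratio_low:.0f}x-{ratio_high:.0f}x faster per sample")
```

So the "30x" estimate sits in the middle of the range these numbers give.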
Member
27 posts
Joined: March 2021
Offline



Redshift vs Karma. That's a huge difference. Can't wait until adaptive sampling is there.
Edited by nickfr - Jan. 2, 2024 13:00:54

Attachments:
Screenshot from 2024-01-02 18-58-30.png (3.5 MB)
Screenshot from 2024-01-02 18-58-47.png (3.2 MB)

Member
273 posts
Joined: Nov. 2013
Offline
nickfr
Don't they give blobby results when rendering animation?
No, not if configured properly. Animated example here [rmanwiki.pixar.com].

A blobby result is sometimes indicative of excessive noise in the input.
Member
98 posts
Joined: Aug. 2015
Online
Honestly I'm still fighting to get a smooth animation render. Even after increasing samples to a pretty high number I still have a bit of noise, and after turning on the denoiser I get that nasty splotchy animated noise that is mostly seen on static or slow-moving objects.
I was gonna test lighting with physical sky and see what happens; this one was with an HDR and it is so noisy...
Member
3 posts
Joined: Nov. 2023
Offline
Mirko Jankovic
Honestly I'm still fighting to get a smooth animation render. Even after increasing samples to a pretty high number I still have a bit of noise, and after turning on the denoiser I get that nasty splotchy animated noise that is mostly seen on static or slow-moving objects.
I was gonna test lighting with physical sky and see what happens; this one was with an HDR and it is so noisy...
https://www.sidefx.com/forum/topic/86908/?page=1#post-375636 [www.sidefx.com]
Member
98 posts
Joined: Aug. 2015
Online
One more thing I did see (I do need to retest to confirm) is that when I switched to physical sky, render time got much better compared to an HDRI on the dome. BUT also, when I blurred the HDR map instead of using the original one, render times went down and there was much less noise compared to the original HDR map on the dome. That enabled me to go with far fewer samples with the blurred vs the original HDR.
But does that mean that any original sharp HDR image is nearly impossible to properly sample and get a nice smooth render?
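
A toy illustration of why blurring can help. This is not how Karma samples a dome light (a real renderer importance-samples the map and also has to deal with BSDFs and visibility); it is only a sketch, with made-up values, of how a tiny, very bright region such as a sun disc inflates the spread of the values being averaged, and how a blur that preserves total energy shrinks that spread.

```python
from statistics import mean, pvariance


def box_blur_wrap(values, radius):
    """Box blur with wrap-around, so the total energy (and mean) is preserved."""
    n = len(values)
    width = 2 * radius + 1
    return [
        sum(values[(i + k) % n] for k in range(-radius, radius + 1)) / width
        for i in range(n)
    ]


def relative_std(values):
    """Standard deviation divided by mean: a rough 'noisiness' measure."""
    return pvariance(values) ** 0.5 / mean(values)


# Hypothetical one-row "HDR": dim sky everywhere, one extremely bright sun pixel.
sharp = [1.0] * 256
sharp[100] = 5000.0

blurred = box_blur_wrap(sharp, radius=8)

print(f"mean sharp:   {mean(sharp):.3f}")
print(f"mean blurred: {mean(blurred):.3f}")
print(f"relative std sharp:   {relative_std(sharp):.2f}")
print(f"relative std blurred: {relative_std(blurred):.2f}")
```

Since the number of samples needed for a given noise level grows with the square of that relative spread, even a modest blur can cut the required sample count a lot in this toy case. Whether the same reasoning explains the behaviour seen in XPU is an assumption, not something confirmed here.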
Member
22 posts
Joined: Oct. 2015
Offline
Just to chime in on the "faster or not faster than Redshift" question: in my experience Karma XPU is a lot faster. Of course not in every case, but when doing tons of particles and pyro, the difference in speed is substantial, and I left Redshift because of this. It would of course be a welcome addition to also match RS in very refractive environments, but overall I think XPU is getting better by the day, which is great to see. It's been my main renderer for over a year now.
Member
9 posts
Joined: Jan. 2017
Offline
Renderman 26 w/ XPU is out for Houdini 20 and is insanely fast on scenes that choke Karma XPU (or that it can't handle the shaders in at all). Their sample "Eisko Louise" scene is a 3D-scanned face model with 2GB of textures and 1GB of misc. that ends up at almost no noise pretty quickly.

An even faster renderer is Luisa Render [github.com]. It's unfortunately not easily usable with anything Houdini will export yet, so I had to hack together a similar scene. I baked out the bubble wrap to .obj after adding two more levels of subdivision in the network for a 1.7M poly model. To bias things horribly in the favor of Karma, I ran it with a spectral sampling model, used a 530GB 24k x 12k HDRI background along with a physical sky, and assigned 3 different 4k PNGs and 2 EXR files (one emission map) as materials on the background plane. Instead of the 3 rectangle lights in the original, I threw 2 spherical object-based emitters in. The bubble wrap material ended up as a layered material with the same dielectric glass with a complex IOR (built-in SF5 glass) and a small thickness of 0.0001 to attempt rendering the actual surfaces properly rather than as a solid object interaction, and was sampled at 4x the rate of the rest of the scene per pixel with an additional 16 ray recursion depth. The materials were set to 0.0001 reflect weight / 0.9999 transmit weight and slight roughness. The main sampling settings were the WavePath sampler with 16384 samples / pixel with a depth of 24 and ray recursion depth of 16, rendered out at 1920x1200, which took around 3:50 with the camera shoved up against the object to maximally abuse things... as far as I can tell it converged. I didn't run any smoothing. The bubble wrap scene on Karma XPU with constant folding and multithreaded OptiX compile turned on took 8:20 and was still noisier than heck in the areas between "bubbles".
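
To put that Luisa run in perspective, here is the raw pixel-sample throughput it implies. This is just arithmetic on the numbers above, reading "3:50" as 3 minutes 50 seconds (an assumption), and it ignores the 4x oversampling on the bubble wrap material, so the real figure would be somewhat higher.

```python
# Settings reported for the Luisa Render test.
width, height = 1920, 1200
samples_per_pixel = 16384
render_seconds = 3 * 60 + 50  # "around 3:50", assumed to mean 3 min 50 s

total_pixel_samples = width * height * samples_per_pixel
samples_per_second = total_pixel_samples / render_seconds

print(f"total pixel samples: {total_pixel_samples:,}")
print(f"throughput: {samples_per_second / 1e6:.0f} million pixel samples / second")
```

The Karma XPU sample count for the 8:20 run isn't stated, so the same figure can't be computed for it from this post alone.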

Zoomed out to roughly where the camera in houdini was from the model, the render time is closer to 2 minutes. Luisa is GPU-only. Scene shader program compilation for the CUDA backend took about 3 seconds. There's also a DirectX 12 backend which I used to compare the speed of the 7900XTX on identical scenes and renderers which isn't really possible anywhere else. I had to hand-edit a pseudo-json scene file with some odd conventions for coordinates to get it set up which was pretty painful but still not as bad as trying to do anything in Blender.

I couldn't try any sort of equivalent materials in Karma XPU since the dielectric glass doesn't appear to work (the BSDF it outputs needs to be plugged into a MaterialX Surface to convert it to the output type Karma wants, and the Surface node isn't supported on XPU, just CPU). The layering thing is also out, but I'm not sure that it really helped. That seems like an oversight, or I'm doing something badly wrong, since afaik the Surface node just takes a BSDF, folds some global data from the object into it, and spits it out. I don't know if the dielectric BSDF is supported either though. The only rough equivalent also requires CPU and turning on the dispersion in a surface builder's transparency.

This is one of the research renderers that crops up every so often, never gets real support in any software, and slowly dies. In the current state several of the more advanced integrators don't work right, or I'm not using them right. I suggested that they aim for USD support to gain compatibility with Houdini, and they said they had already been considering it as an option so it could be used with Blender, so there is some hope for at least that.

Anyway I attached the result image from it
Edited by GnomeToys - April 17, 2024 16:54:47

Attachments:
bubblewrap.png (3.0 MB)

Member
9 posts
Joined: Jan. 2017
Offline
I'd forgotten that the layered material kinda messed up the spectral splitting, but I have a render from further away without the layering or the x4 sample count on the bubble wrap material. Apologies for weirdness with color; there's some bizarre purple shift that shouldn't exist that only happens on the bubble wrap and needed to be cleaned up, and the ground plane is positioned badly, but meh, I wasn't trying for pretty here, just a speed check. Typing scene files in a format you don't know that can't decide which direction is up (had to resort to setting an initial transform on everything that pointed their forward and up vectors in the same direction) isn't my idea of fun. If you zoom in you can see that it didn't quite converge in the 2 minute render time, but it came close to "acceptable noise" levels by my eyes aside from the stray bright blue firefly. It's possible from the fine structure of the spectral splitting in certain spots that the minor noise is a result of sampling above the Nyquist limit on the light spectrum and it is accurate, or that another integrator would work better on this scene. One of the samples uses the MegaPath integrator at 65536 spp, but no examples exist for many others, so who knows. Only one of their example scenes took longer than 2 minutes on the RTX 4090, and only after I raised the resolution to 4k... oddly the 7900 XTX with the DX12 backend was estimating a lower render time by half on that one. I checked the 4090 on DX12 instead of CUDA and there was no difference.

One last thing I've noticed is that for certain materials, usually when heavy refraction is involved, Karma CPU is faster than XPU. It seems like if the MtlX Tiled Image node is used it might be using an image size * tile count amount of memory. If I throw maybe 5-6 2k textures on a single Karma shader but tile them at a high level I can easily go over 24GB. That makes more sense for something like the hexagon tiling node that would benefit more from baking out the mess of splotched together parts it generates but I can't see how a normally tiled material that just repeats over UVs would need it. Of course I like displacement maps a bit too much so I could just be accidentally dicing the geometry into trillions of parts, but it seems like that isn't the only time it happens.
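
A quick sanity check on that guess. The sketch below assumes uncompressed RGBA textures stored as 32-bit floats with no mipmaps (all assumptions; Karma's actual in-memory texture format isn't stated here) and asks how many tile copies it would take for six 2k textures to fill 24 GiB if memory really did scale with tile count.

```python
import math


def texture_bytes(width: int, height: int, channels: int = 4,
                  bytes_per_channel: int = 4) -> int:
    """Uncompressed size of one texture in bytes (no mipmaps assumed)."""
    return width * height * channels * bytes_per_channel


GIB = 2 ** 30
MIB = 2 ** 20

one_texture = texture_bytes(2048, 2048)   # one 2k RGBA float32 texture
texture_count = 6                         # "maybe 5-6 2k textures"
untiled_total = one_texture * texture_count

budget = 24 * GIB
# If memory were (image size * tile count), this many tile copies fills the budget.
tiles_to_fill_budget = math.ceil(budget / untiled_total)

print(f"one 2k texture: {one_texture / MIB:.0f} MiB")
print(f"{texture_count} textures, untiled: {untiled_total / MIB:.0f} MiB")
print(f"tile copies needed to reach 24 GiB: {tiles_to_fill_budget}")
```

Under these assumptions an 8x8 tiling would be enough to hit 24 GiB, so the numbers are at least consistent with the "image size * tile count" theory, though they don't prove it.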

Attachments:
bubblewrap_luisa2.png (4.0 MB)
