usd performance strategy



blakshep: Member; 77 posts; Joined: Oct. 2016; Offline

Feb. 15, 2022 2:03 p.m.

Hello,

It seems I only get usable performance when I modify my scene at the very end of my tree.
For example, if I try to adjust a camera near the top, it gets super slow because a lot of geometry is merged in later. I get very good performance when I modify the camera after the node I'm viewing my scene from, with all the sets merged in.
So far I just copy the node back up the tree when I like the result, but this doesn't feel like the intended way to work. My geo paths could also have changed by then, so things like the "edit" node stop working.
How are you dealing with this issue?


jsmack: Member; 7771 posts; Joined: Sept. 2011; Offline

Feb. 15, 2022 2:52 p.m.

I use a merge node at the end to combine camera and light layers. That way I'm not modifying anything like that at the top of the tree.


blakshep: Member; 77 posts; Joined: Oct. 2016; Offline

Feb. 16, 2022 1:27 p.m.

Thanks! Yeah, that's what I've done so far, but I thought I was doing something wrong.
I was expecting it not to slow down, since the rest of the network is unchanged and I'm just changing a camera transform, for example.
Is this a design flaw? A bug? Is this how it's supposed to work?


antc: Member; 273 posts; Joined: Nov. 2013; Offline

Feb. 16, 2022 5:13 p.m.

blakshep
Thanks! Yeah, that's what I've done so far, but I thought I was doing something wrong.
I was expecting it not to slow down, since the rest of the network is unchanged and I'm just changing a camera transform, for example.
Is this a design flaw? A bug? Is this how it's supposed to work?

On the LOPs side that's pretty much how it's supposed to work. LOP nodes are free to change the incoming stage in whatever way they want, so Houdini has to assume that a change to one node can affect everything below it. It's kind of the price of having a system that's fluid and easy to wire together. The alternative is something like Maya, where dependencies are managed at a finer level (attributes), giving the system a better idea of the real input and output data. The downside, of course, is that building graphs is harder and usually requires scripts to configure anything substantial. Keeping unrelated nodes in separate branches is therefore the best tactic to keep things fast.
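The "recook everything downstream" behaviour can be pictured with a small toy model. This is plain Python, not the Houdini API: `Node`, `mark_dirty` and `recooked_after_edit` are made-up names, and real LOP cooking is far more involved. The sketch only assumes that editing a node dirties every node below it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a LOP node (not a Houdini class)."""
    name: str
    inputs: list = field(default_factory=list)
    dirty: bool = True
    cooks: int = 0

    def cook(self):
        # Cook the inputs first, then this node, but only if it is dirty.
        for node in self.inputs:
            node.cook()
        if self.dirty:
            self.cooks += 1
            self.dirty = False


def mark_dirty(node, all_nodes):
    """Dirty a node and, recursively, every node wired below it."""
    node.dirty = True
    for other in all_nodes:
        if node in other.inputs:
            mark_dirty(other, all_nodes)


def recooked_after_edit(nodes, edited, display):
    display.cook()              # initial cook of the whole graph
    for n in nodes:
        n.cooks = 0
    mark_dirty(edited, nodes)   # simulate a parameter change
    display.cook()
    return [n.name for n in nodes if n.cooks]


# Serial: camera at the top, heavy nodes below it.
camera = Node("camera")
forest = Node("forest", [camera])
fx = Node("fx", [forest])
serial = recooked_after_edit([camera, forest, fx], camera, fx)

# Branched: camera in its own branch, merged in at the end.
forest = Node("forest")
fx = Node("fx", [forest])
camera = Node("camera")
merge = Node("merge", [fx, camera])
branched = recooked_after_edit([forest, fx, camera, merge], camera, merge)

print(serial)    # ['camera', 'forest', 'fx']
print(branched)  # ['camera', 'merge']
```

In the branched layout the camera edit only touches the camera and the merge, so the heavy `forest` and `fx` nodes keep their cooked result.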

On the USD side, operations that alter the composition of the scene (referencing, setting variants, etc.) are significantly more expensive than setting properties on prims. So it's worth trying to minimize how often composition fires. Lastly, composition can't vary over time in USD, so try to make sure that nothing composition-related happens in a time-dependent LOP.

Edited by antc - Feb. 16, 2022 17:48:13


goldleaf: Staff; 4164 posts; Joined: Sept. 2007; Online

Feb. 17, 2022 11:02 a.m.

Related to the already excellent suggestions: I suggest working to avoid Houdini Time Dependencies wherever possible, especially ones high-up in your node network. These are the bright-green clock badges that show up on nodes in Houdini. The behavior isn't new to Houdini, it's just easier for time-changes to cause pain in LOPs, compared to SOPs. Even innocent nodes like Render Settings, which can be time dependent only to set the product name (file path), can end up causing huge amounts of your LOP network to re-cook on every frame, even if it doesn't need to.

You can reduce or avoid these by appending a Cache LOP right after one or more time-dependent nodes (set it to cache all frames), or by writing to disk and layering back in.
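As a rough picture of why caching right after a time-dependent node helps, here is a toy sketch in plain Python. It is not how the Cache LOP is implemented: `TimeDependentNode`, `FrameCache` and `play` are hypothetical names, and the file path is made up. It only shows the idea of remembering each frame's result so the upstream node cooks once per frame instead of on every visit.

```python
class TimeDependentNode:
    """Hypothetical node whose output changes with the frame number."""

    def __init__(self):
        self.cooks = 0

    def cook(self, frame):
        self.cooks += 1
        # Made-up product name that depends on the frame.
        return f"render/shot.{frame:04d}.exr"


class FrameCache:
    """Toy 'cache all frames': stores each frame's result after first cook."""

    def __init__(self, source):
        self.source = source
        self.results = {}

    def cook(self, frame):
        if frame not in self.results:
            self.results[frame] = self.source.cook(frame)
        return self.results[frame]


def play(node, frames, loops):
    """Step through the frame range several times, like looping playback."""
    for _ in range(loops):
        for frame in frames:
            node.cook(frame)


frames = range(1, 6)

uncached = TimeDependentNode()
play(uncached, frames, loops=3)

source = TimeDependentNode()
play(FrameCache(source), frames, loops=3)

print(uncached.cooks)  # 15
print(source.cooks)    # 5
```

Writing to disk and layering back in plays the same role as the dictionary here, except the stored results survive between sessions.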

I'd also recommend going to Edit > Preferences > Lighting, and turning Off the preference "Panes Follow Current Node". While it's convenient sometimes to be able to select a LOP node and see the Scene Graph Tree reflected at that node's position, sometimes the work USD/Solaris have to do, in order to draw that scene graph can lead to some performance issues. With that preference turned off, you need to move the display flag, to inspect the stage at that location in the SG Tree, Viewer, SG Details, etc...

I'm o.d.d.


Tim Crowson: Member; 240 posts; Joined: Oct. 2014; Offline

Feb. 17, 2022 12:11 p.m.

goldleaf
I'd also recommend going to Edit > Preferences > Lighting, and turning Off the preference "Panes Follow Current Node". While it's convenient sometimes to be able to select a LOP node and see the Scene Graph Tree reflected at that node's position, sometimes the work USD/Solaris have to do, in order to draw that scene graph can lead to some performance issues. With that preference turned off, you need to move the display flag, to inspect the stage at that location in the SG Tree, Viewer, SG Details, etc...

I didn't know about this one. This is great!

In general, this entire topic is one we have been dealing with as well. Solaris seems to assume that all nodes below the one we're currently editing must be recooked, even if the nodes deal with things that have zero bearing on each other (I just edited the spec roughness on a shader inside a material library, why are all my collections and prunes and lights recooking?). I understand that this is by design (and indeed, considering how complex Houdini's node interactions can potentially be, how could it possibly manage the dirty states effectively?) but this is probably the greatest workflow barrier we have run into. It forces us to do certain operations at the end of the graph for feedback purposes, then relocate those nodes into a more preferable place in the graph for pipeline reasons.

Since serial graph construction gives us this headache, we parallelize what we can. But not everything can be parallelized. Many LOP nodes require the incoming stage as a source input which they evaluate (the Collection LOP, for example, and many more). So we get into games of "can we parallelize this part of it at least?" - "but you can't just merge that willy-nilly, because of layering orders, etc." - "And these nodes have to stay in a series, we can't change those".... Parallelization has real limits in this context.

Having come from Katana, many of us fully expected to mind the order of nodes in a serial graph, but we didn't expect the added ramifications of Solaris's "recook everything below this" behavior on graph design. I personally really love how flexible Solaris is... it has a lot more options for directing graph flow and logic, I just wish that flexibility didn't come at the expense of performance. But that said we're still able to produce content on a deadline.

- Tim Crowson
Technical/CG Supervisor


blakshep: Member; 77 posts; Joined: Oct. 2016; Offline

Feb. 17, 2022 3:56 p.m.

I did notice that Render Settings often takes ages to cook compared to what I would expect. Good to know why!
I agree with Tim; I believe this should be managed better somehow.

Like we have the insertion points, which as far as I understand are for editing nodes at an earlier stage of the tree, except you can't really modify anything that's not at the end of the tree, because you will get 0.1 fps doing it, depending on the scene.

I've often had scenarios similar to Tim's, where I just want to change the diffuse color of some objects but an entire forest recooks with every little parameter change.

Thanks for all the replies!


Tim Crowson: Member; 240 posts; Joined: Oct. 2014; Offline

Feb. 17, 2022 7:28 p.m.

blakshep
Like we have the insertion points, which as far as I understand are for editing nodes at an earlier stage of the tree

Insertion points are for telling Houdini where to place new nodes of a certain type.

Edited by Tim Crowson - Feb. 17, 2022 19:28:23

- Tim Crowson
Technical/CG Supervisor


blakshep: Member; 77 posts; Joined: Oct. 2016; Offline

Feb. 18, 2022 11:51 a.m.

Yeah, well, they're for putting nodes somewhere other than where you currently are in the tree, which is very slow in Solaris, so at the moment I don't feel I can use them.

I'm also having an issue right now: I'm editing a layout with an Edit node at the end of my tree, but every single time I move something, Houdini hangs for 5 seconds because it's merging in a large FX file that gets pruned earlier since it's not visible from that view.
I thought the best approach in Solaris was to have a build with everything that could be needed included, and just prune out whatever isn't needed in the shot. I could just merge in the elements I need for each shot, but that would surely make for a messy, unreadable tree.

Any advice, besides temporarily unplugging heavy elements?


antc: Member; 273 posts; Joined: Nov. 2013; Offline

Feb. 18, 2022 5:06 p.m.

Any chance you can post some screen grabs of your network? It would probably make it easier to make suggestions.


arx_anima: Member; 80 posts; Joined: Aug. 2019; Offline

Feb. 19, 2022 5:10 a.m.

So far what works best for us is to save everything that comes into the lighting file as a USD layer.
If something needs to be animated in the lighting file, it also becomes its own USD file, so basically we cache everything.
Solaris is very, very efficient when you load USD layers from disk.
And as mentioned above, we also try to make the edits we often need to change (lights, for example) at the end.
One more thing that cost us a lot of performance in the beginning was getting used to the USD way of inheriting attributes: the more prims you edit, the slower the network gets down the line, but if I can edit one prim somewhere at the top and every child simply inherits from it, it is basically free.
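To illustrate the inheritance point, here is a toy sketch in plain Python. It is not the USD API: `resolve` is a hypothetical helper, `primvars:tint` is a made-up name, and the lookup rule (nearest authored value on the prim or an ancestor) is a simplification that only applies to values that inherit down the hierarchy, such as primvars. The point is just the count of authored edits.

```python
def resolve(authored, path, attr):
    """Return the nearest authored value, walking from the prim up to the root."""
    while path:
        value = authored.get(path, {}).get(attr)
        if value is not None:
            return value
        # "/forest/tree_1" -> "/forest" -> "" (stops the loop)
        path = path.rsplit("/", 1)[0]
    return None


ATTR = "primvars:tint"  # hypothetical inherited value
trees = [f"/forest/tree_{i}" for i in range(1000)]

# Option A: author the value on every tree prim (1000 edits).
per_prim = {path: {ATTR: "autumn"} for path in trees}

# Option B: author it once on the parent and let the children inherit.
inherited = {"/forest": {ATTR: "autumn"}}

same = all(
    resolve(per_prim, p, ATTR) == resolve(inherited, p, ATTR) for p in trees
)
print(len(per_prim), len(inherited), same)  # 1000 1 True
```

Both options resolve to the same value on every tree, but the second one authors a single opinion instead of a thousand.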

One thing we found out is that VOP materials can slow down the whole network really badly when the HDAs are opened, because Solaris then writes the VEX code into the stage and this seems to take some time.


Ruspa: Member; 30 posts; Joined: Feb. 2021; Offline

March 2, 2022 6:37 a.m.

Interesting points in this conversation, which are in line with my brief experience with Solaris. I agree: the node network seems to get slow depending on how the graph is put together and which nodes you are using. The fact that Solaris lets you manipulate the graph so extensively seems to work against both performance and consistency: the result you get in Hydra and in the stage is not always the same as what you get once you bake out to a USD file or do a USD render.

In a situation where the assets are particularly heavy I noticed a significant performance drop. Using LODs might help to visualize geometry in the viewport but does not help the graph. Payloads might help the graph, but I've found it quite difficult to make good use of them, or I don't know how to use them properly.

As already mentioned in the thread, I also noticed that some nodes can slow down your graph. For example, if you use a Graft node to append a primitive to another existing geo primitive, it will affect graph performance and will also have a side effect on USD render submission: it forces the USD ROP to rebake the entire geo into the USD file (which takes a long time) and therefore you can't take advantage of sub-layering.

Other examples: Render Var nodes are quite slow to parse in my experience, but possibly the biggest impact comes from filtering: how many shader assignments you have in the scene, assigning primvars to geometry, etc. Time dependencies are another factor. The Hydra render does not seem to perform very well either, especially if you are using a GPU-based renderer, where VRAM has a big impact on graph performance. So doing graph manipulation while rendering can become a challenge.


leoYfver: Member; 31 posts; Joined: July 2015; Offline

March 2, 2022 7:54 a.m.

Thanks for this thread, a lot of good tips!

Another part that is very slow for me is the Material Library. It evaluates all the time and takes quite some time to do the whole evaluation. If I create a new node, the whole material builder starts evaluating again and that takes some seconds; removing or connecting a node locks up the scene again while it evaluates. So I can't really do lookdev unless I set the scene to Manual evaluation.

Are there plans to make this more efficient, such as dirty-tagging nodes, or does it need to reevaluate everything?

cg supervisor @goodbyekansas


Tim Crowson: Member; 240 posts; Joined: Oct. 2014; Offline

March 2, 2022 5:09 p.m.

Your best bet for now is to not place your material_library node in line with the rest of the graph, but to instead have it disconnected from the incoming stream and merge it in (unless for some reason it requires an incoming stage, which is uncommon). This should prevent it from recooking.

Edited by Tim Crowson - March 2, 2022 17:10:08

- Tim Crowson
Technical/CG Supervisor


leoYfver: Member; 31 posts; Joined: July 2015; Offline

March 3, 2022 3:04 a.m.

Thanks for the tip! It seems, though, that I still get the same slowness as soon as I merge it into the tree. I tried separating it into multiple material libraries and then merging them. That gave some speedups when the view flag was only on the specific material library, but as soon as the view flag was in the tree I experienced the same slowness again.

cg supervisor @goodbyekansas


leoYfver: Member; 31 posts; Joined: July 2015; Offline

March 3, 2022 5:03 a.m.

I did some more testing, and in my case it seems to be more about how many shaders need to be translated into the USD stage. In the attached example I only have a material library with a bunch of principled shaders inside a material builder, no other nodes. If I have the shaders but don't read them into the USD stage, the snappiness is back.

Edited by leoYfver - March 3, 2022 05:03:53

Attachments:
solaris_materiallibrary_slowness.gif (7.5 MB)
solaris_materiallibrary.hip (2.3 MB)

cg supervisor @goodbyekansas


jsmack: Member; 7771 posts; Joined: Sept. 2011; Offline

March 3, 2022 12:51 p.m.

leoYfver
I did some more testing, and in my case it seems to be more about how many shaders need to be translated into the USD stage. In the attached example I only have a material library with a bunch of principled shaders inside a material builder, no other nodes. If I have the shaders but don't read them into the USD stage, the snappiness is back.

You're adding nodes to the material library though. Of course it has to retranslate to the stage. There's nothing you can do here except not put a bunch of materials into a library and then edit them there. I would also not use material builders if you don't have to. Principled shaders can be translated directly to USD prims, whereas material builders must create code. The code can be cached, but translation would need to happen after any change.

Another option to explore is MaterialX shaders. They can be translated to shade networks more directly, as they don't create any code, so updates should be more responsive.


leoYfver: Member; 31 posts; Joined: July 2015; Offline

March 4, 2022 2:27 a.m.

jsmack
You're adding nodes to the material library though. Of course it has to retranslate to the stage. There's nothing you can do here except not put a bunch of materials into a library and then edit them there. I would also not use material builders if you don't have to. Principled shaders can be translated directly to USD prims, whereas material builders must create code. The code can be cached, but translation would need to happen after any change.

Yes, that's what I wrote. The problem is that it happens when creating or deleting a node anywhere in the material library; for example, an unconnected null node inside a material builder will also take seconds to translate. I'm not whining, just pointing out a slow part of our workflow.

It doesn't matter whether it's a material builder; it's the nodes that need to be translated. We use V-Ray, and we also use collect nodes with both V-Ray shaders and USD Preview Surface connected, so it adds up to quite a lot of shaders quite fast, which makes it slow for us.

edit1:
It seems the mtlx subnet is actually quite a bit faster. But that's also because it's leaner and seems to only promote parameters when they are changed from their defaults. Unfortunately mtlx is a no-go for us for now.

edit2:
All right, it seems there are actually two different things happening. One is VEX shaders that need to be translated, as jsmack said. The other, which is the one we seem to have an issue with, is when there are a lot of non-default parameters, so they all need to be listed in the graph. Some renderers seem to only list shaders that have non-default values. So if, for example, you have a mtlxsubnet with a disney15 surface shader and duplicate it 30 times, it will be very snappy, but if you change all the values to non-default, the slowdown starts to happen.

Edited by leoYfver - March 4, 2022 03:16:23

cg supervisor @goodbyekansas
