Dependency Based on Changes in Workitem Result Between Runs
PDG/TOPs
9of9
I've been trying to figure this out for a good while now, and at this point I'm really struggling to wrap my head around how I even ought to approach it. If there are any good resources out there that show something like this being done, I'd love to see a link.

As a test, I'm trying to solve a really simple problem - I'm projecting roads (curves) into a grid of 4x4 terrain heightfield tiles. When I wiggle one of the roads, I want only the tiles that this road intersects with to update. Kenny Lammers' tutorials cover this pretty well overall.

My problem is, I want to bring my input curves in as a single file. I.e. within PDG I want to take a flatpack geometry file as input, figure out how many different curves there are in it, split them out into separate workitems, and then do the usual thing with partitioning by bounds and pairing up terrain tiles with their intersecting curves etc.



Fairly straightforward flow - I get the input geometry, attrib-promote the maximum connectivity value to a detail attribute, bring that detail attribute in as a PDG attribute, and create an input wedge now that I know the number of inputs. Then I do another geometry input to create multiple workitems (in this case two) by splitting out connectivity groups from that initial flat pack.

This stuff then plugs into Partition by Bounds.
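The per-curve split in that flow boils down to grouping primitives by their connectivity class. A minimal standalone sketch of just that grouping step (plain Python outside Houdini; the function name and data layout are illustrative, not part of any Houdini API):

```python
from collections import defaultdict

def split_by_connectivity(prim_classes):
    """Group primitive indices by connectivity class id.

    prim_classes: list where prim_classes[i] is the connectivity class
    (e.g. the attribute written by a Connectivity SOP) of primitive i.
    Returns (num_curves, {class_id: [primitive indices]}).
    """
    groups = defaultdict(list)
    for prim_index, class_id in enumerate(prim_classes):
        groups[class_id].append(prim_index)
    return len(groups), dict(groups)

# Two curves: primitives 0-1 belong to curve 0, primitive 2 to curve 1.
num_curves, groups = split_by_connectivity([0, 0, 1])
```

The number of groups is what the wedge count would be driven from; each group is what one downstream workitem would isolate.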

My problem is that the naive PDG flow treats that first get_Num_Inputs node as the arch dependency for everything downstream - if the input file gets changed, then everything gets dirtied, and recooked.

Instead, what I want is for the get_InputGeo step to check which workitems/results have actually changed since the previous run, and to not dirty the results that haven't changed.

Basically - if the results in a workitem are identical to the previous run, don't dirty the downstream dependencies of those workitems, even if the upstream nodes those workitems were generated from have recooked on this run. I kind of expected this to be default functionality, but I'm not sure whether something in my setup prevents it, or whether there's manual setup I need to do to enable it.

Working with the Python API, I've noticed that the results tuple on workitems is meant to store a checksum for each results file - and that by default the checksum is always 0, as far as I can tell. So I've experimented with adding a Python processor node that would try to add actual checksums to the results, to see if that data gets used in the way I want it to be: presumably if your results have the same checksums as they did on a previous run, then you can assume all the downstream cooks for your workitem would come out the same, and you don't need to recook them.
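For reference, the experiment described above amounts to hashing each result file and storing the digest as an integer checksum. A minimal sketch of the hashing side (the commented addResultData usage is my assumption of how the checksum argument would be supplied - check the pdg.WorkItem docs for the exact signature):

```python
import hashlib

def file_checksum(path):
    """Return a stable 32-bit integer checksum of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    # Truncate the 128-bit digest to 32 bits so it fits an int checksum slot.
    return int.from_bytes(digest.digest()[:4], "big")

# Inside a Python Processor you might then attach it to the result,
# along the lines of (hypothetical usage):
# work_item.addResultData(path, "file/geo", file_checksum(path))
```

Any stable hash works here; the only requirement is that identical file contents always produce the same integer across runs.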

But no - even if I explicitly tell PDG these files have the same checksum, it still recooks everything.

I could probably rig something up where the checksums are stored and compared purely via Python, and maybe add a logic branch to the PDG network where just the curve inputs that have changed get split out and re-projected into the terrain while other terrain tiles manually find and use the geometry cache on disk from the previous run… but I'd really rather see if there's a better way to do it, since then I'm basically rewriting a bunch of work the PDG is already doing and putting in my own dependency tracking. It shouldn't have to get that convoluted!
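That "rig something up in Python" approach can at least be kept small: persist a {result path: checksum} map per run and diff it against the previous run to decide which curve workitems actually changed. A standalone sketch (the cache file location and structure are my own invention, not anything PDG provides):

```python
import json
import os

def changed_results(current, cache_path):
    """Return the set of result paths whose checksum differs from the
    previous run, and persist the current checksums for the next run.

    current: dict mapping result path -> integer checksum.
    cache_path: JSON file holding the previous run's map.
    """
    previous = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            previous = json.load(f)
    changed = {path for path, cs in current.items() if previous.get(path) != cs}
    with open(cache_path, "w") as f:
        json.dump(current, f)
    return changed
```

Downstream, only items whose result path lands in the changed set would be re-projected; the rest could reuse the cached geometry on disk from the previous run.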

Perhaps this is doable via the API? You can call a function to dirty workitems in a Python processor, but I can't see if there's a way to undirty a workitem that will have already been dirtied by upstream changes.

Maybe there's some nuance of the Static/Dynamic flows I'm not understanding that's preventing this from working the way I want it to. Any ideas, anyone?
tpetrick
When a work item depends on another upstream item, it'll be dirtied if the upstream item is dirtied. The file checksums aren't going to override that.

One option is to have your Python Processor not actually create a dependency on the upstream items, e.g. something like the following in the Generate implementation:

# In the Generate callback: create one new work item per upstream item,
# copying the attribute values across. No parent is passed to addWorkItem,
# so no dependency link is created between the items.
for upstream_item in upstream_items:
    item = item_holder.addWorkItem(index=upstream_item.index)
    item.data.setFloat("example", upstream_item.data.floatData("example", 0), 0)

Note that when adding the new item, no parent is specified in the options, so a dependency link between the items won't be created. You'll probably also need to provide a custom “Regenerate Static” implementation, which gets called when a node generates but already has existing items; use that to copy any updates from the upstream items into the items in the node.
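One way to picture that "Regenerate Static" step, stripped of the PDG API: match the node's existing items to upstream items by index, copy over any attribute values that differ, and report which items actually changed (so only those need re-dirtying). A plain-Python sketch of that matching logic (the dicts here are stand-ins for the real pdg work item types):

```python
def copy_upstream_updates(existing, upstream):
    """existing, upstream: dicts mapping item index -> attribute dict.

    Copies upstream attribute values onto the matching existing items,
    and returns the set of indices whose attributes actually changed.
    """
    changed = set()
    for index, up_attrs in upstream.items():
        cur_attrs = existing.get(index)
        if cur_attrs is None:
            continue  # no matching item; a real impl might add one here
        for name, value in up_attrs.items():
            if cur_attrs.get(name) != value:
                cur_attrs[name] = value
                changed.add(index)
    return changed
```

Items whose index lands in the returned set are the only ones whose downstream results can no longer be trusted.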

Depending on what you're trying to do, it might also be possible for us to add the necessary features to the Geometry Import node so you don't need the get_num_inputs or Wedge node at all. Can you attach an example file + input geometry?
9of9
Interesting suggestion. I'll take a look at how far I can get with that kind of implementation - I would have expected these new workitems to always be dirty as well, but if that can be avoided, then perhaps this would work and I could add some decent way to track changes.

I can definitely supply the example file and HDAs once I'm back in the office - it's a nice little standalone test.

As a general case, if - in spite of upstream changes that have dirtied it - a workitem evaluates to have the same checksum and the same attributes, is there any reason you would want to keep it marked as dirty? I would have expected this to be a fairly reliable and universal optimisation within PDG, i.e. an assumption that identical inputs will always give you identical outputs.
9of9
Here's the files - there's an HDA for creating the base terrain pieces, and an HDA for projecting the curves.