Partitioner Node Callbacks

Partitioner nodes group multiple upstream work items into single partitions.

Overview

Partitioner nodes are the mechanism that a PDG graph uses to group together multiple upstream work items. The groups of work items produced by the node are called partitions; a partition is itself a special type of work item that directly depends on the items it contains. Partitions also inherit their attributes and output file lists from those work items. The Advanced tab of each partitioner node has parameters for controlling how upstream attributes are copied onto the partition.

PDG includes a number of built-in partitioner nodes that you can use to group work items by properties such as their attribute values, indices, frames, or node topologies. You can use the Partition by Expression or Python Partitioner nodes to write custom partitioning logic for cases that aren’t handled by the nodes that ship with Houdini, as shown in the sketch below. You can also write your own custom partitioner nodes as standalone Python scripts.
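
For example, the partitioning logic in a Python Partitioner can be as simple as grouping work items by the parity of their index. The following is only an illustrative sketch, and assumes the partition_holder and work_items variables that are available to the node’s callback:

# Group work items into two partitions based on the parity of their index.
# partition_holder and work_items are the callback's inputs.
for work_item in work_items:
    partition_holder.addItemToPartition(work_item, work_item.index % 2)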

Much like a processor node, a partitioner can be either Static or Dynamic. Static partitioners perform their grouping logic during the static cook pre-pass. The input to a static partitioner is the list of all static work items across all input connections. If a static partitioner has an input node that is dynamic, it skips that node and traverses upward until it finds a node with static work items. Dynamic partitioners evaluate their grouping logic once all input nodes have generated their work items. Because a dynamic node generates its work items as the items in the node above it cook, this means that a dynamic partitioner has to wait for all nodes two levels upstream to be cooked before partitioning its input work items.

Splitting by Attribute

Before a partitioner runs the onPartition callback for the input work items, PDG can optionally split those work items based on an attribute value. When the Split by Attribute parameter is turned on for a partitioner, PDG calls the onPartition method once for each unique value of the split attribute. The method is called with the list of input work items that have that particular attribute value. PDG ensures that partitions created in those calls have distinct indices, so that work items with different values for the split attribute are never in the same partition.

Work items that are missing the attribute can be added either to none of the partitions or to all of them. This feature is available on all nodes by default and requires no extra work by node authors. You can also handle the missing attribute case in your own code by selecting the appropriate option for the Missing Attribute parameter on the Python Partitioner or on your custom node. The underlying API function for splitting work items is also available from Python.

When the attribute splitting option is turned on, there are several extra properties available on the pdg.PartitionHolder passed into the onPartition callback. For example, the holder can provide the name and current value for the split attribute. For additional information, please refer to the API documentation for that class.
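
For example, a callback might log which split it is currently processing. The property names used below are assumptions based on the description above; check the pdg.PartitionHolder documentation for the exact spelling:

# Log the split attribute name and the value for this invocation.
# splitAttribute and splitValue are assumed property names; see the
# pdg.PartitionHolder API documentation for the exact names.
print("Partitioning items where {}={}".format(
    partition_holder.splitAttribute, partition_holder.splitValue))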

The partitioner node loops through the list of input work items once to build the data structure used to split by attribute and to detect work items that are missing the attribute. This is much more efficient than attempting to achieve the same behavior with multiple TOP nodes. However, the attribute splitting functionality performs best when the number of unique values is small compared to the total number of work items. Each unique value results in a separate onPartition callback invocation, which has a performance cost that is only offset if a large amount of work is done in each callback. For example, if you have 100,000 work items and 80,000 unique attribute values, it may be better to write custom logic in a Python Partitioner node instead.
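
In that situation, a single onPartition invocation can map each unique value to a partition index in one pass over the input. The sketch below assumes a hypothetical string attribute named cluster, read through pdg.WorkItemData.stringData:

# Map each unique value of the hypothetical "cluster" attribute to its own
# partition index, using one callback invocation instead of one per value.
partition_indices = {}
for work_item in work_items:
    value = work_item.data.stringData("cluster", 0)
    if value not in partition_indices:
        partition_indices[value] = len(partition_indices)
    partition_holder.addItemToPartition(work_item, partition_indices[value])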

Partition Attributes

Partitioner nodes are currently not able to add custom attributes to partitions. Partitions inherit their attributes and output files from the work items in the partition, based on the parameters on the Advanced tab of the node. If the Merge Input Attributes parameter is turned off, the partitions do not inherit any attributes. However, they still have all of the output files from the items in the partition copied to their own output lists. If the merge parameter is turned on, attributes from the work items are merged into the partitions’ attributes. The documentation for each partitioner node, such as the Python Partitioner, includes more details on the purpose of each parameter.

Merging works by first sorting the work items based on the sort parameters on the partitioner node. PDG then iterates over the sorted items and copies attribute values from them to the partition. If an attribute already exists on the partition, the incoming value is ignored. For example, if all of the work items have the same set of attribute names, then only the attribute values from the first work item in the sorted list are copied onto the partition. If the second work item in the sorted list has an attribute that the first item does not have, then that attribute is also copied, and so on. The sorting order used in the merge process also determines the order of the output files on the partition.

Node Callbacks

Partitioner nodes have a single callback method that receives the list of upstream work items as input. The callback function is expected to return a pdg.result value that indicates the status of the partitioning operation.

onPartition(self, partition_holder, work_items) → pdg.result

This callback is evaluated once for each partitioner during the cook of a PDG graph, or once for each unique attribute value (if Split by Attribute is turned on). If the partitioner is static, the callback is run during the static pre-pass. Otherwise, it is evaluated during the cook after all input work items have been generated. The list of upstream work items eligible for partitioning is passed to the function through the work_items argument. The partition_holder argument is an instance of the pdg.PartitionHolder class and is used to create partitions.

Each partition is defined using a unique numeric value supplied by the onPartition function. Work items are added by calling the addItemToPartition function with the work item itself and the partition number:

# Add each work item to its own unique partition
partition_holder.addItemToPartition(work_items[0], 0)
partition_holder.addItemToPartition(work_items[1], 1)

# Add both work items to a third, common partition
partition_holder.addItemToPartition(work_items[0], 2)
partition_holder.addItemToPartition(work_items[1], 2)
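
The snippets above show only the body of the callback. A complete implementation also returns a status value; for example, the following sketch places every input work item into a single common partition and reports success (pdg.result.Success is assumed here as the success status; see the pdg.result documentation for the full set of values):

def onPartition(self, partition_holder, work_items):
    # Add every input work item to one common partition
    for work_item in work_items:
        partition_holder.addItemToPartition(work_item, 0)

    # Report that the partitioning operation succeeded
    return pdg.result.Success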

You can add a work item to multiple partitions, or to none of them. Sometimes a node may wish to add a work item to all partitions before it knows how many partitions will be created. The addItemToAllPartitions method marks a work item as belonging to all partitions, including partitions that are created after the call is made.
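
For example, the following sketch adds the first work item to every partition and then groups the remaining items into batches of ten; the batching scheme is purely illustrative:

# The first work item belongs to every partition, including partitions
# created later in the loop below
partition_holder.addItemToAllPartitions(work_items[0])

# Group the remaining work items into partitions of ten
for index, work_item in enumerate(work_items[1:]):
    partition_holder.addItemToPartition(work_item, index // 10)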

You can also mark a work item as a requirement for its partition. If that work item is deleted, the entire partition is also deleted, even if other work items in the partition still exist. For example, the Partition by Combination node uses this behavior when creating partitions from pairs of upstream work items. If one of the work items in a pairing is deleted, the partition is no longer valid because it no longer represents a pair.

The following code is a possible implementation of an onPartition function that forms a partition for each unique pair of input work items:

partition_index = 0

# Outer loop over the work items
for index1, item1 in enumerate(work_items):
    # Inner loop over the work items
    for index2, item2 in enumerate(work_items):
        # We want to have only one partition for each pair, no matter what
        # the order. If we don't have this check we'll get a partition for
        # both (a,b) and for (b,a).
        if index2 <= index1:
            continue

        # Add both items to the next available partition, and flag the items
        # as required
        partition_holder.addItemToPartition(item1, partition_index, True)
        partition_holder.addItemToPartition(item2, partition_index, True)

        partition_index += 1

Reference

  • All TOPs nodes

    TOP nodes define a workflow where data is fed into the network, turned into "work items" and manipulated by different nodes. Many nodes represent external processes that can be run on the local machine or a server farm.

  • Processor Node Callbacks

    Processor nodes generate work items that can be executed by a scheduler.

  • Scheduler Node Callbacks

    Scheduler nodes execute work items.

  • Custom File Tags and Cache Handlers

    PDG uses file tags to determine the type of an output file.

  • Python API

    The classes and functions in the Python pdg package for working with dependency graphs.