HDK
Writing a Partitioner

Overview

Partitioner nodes define logic that operates on a list of work items and groups them into subsets based on a common property. For example, work items might be grouped together by an attribute or frame value, possibly based on node parameters.

In order to write a custom partitioner, you should create a class in C++ that derives from PDG_NodeCallback. The class will need to implement the PDG_NodeCallback::onPartition() method, which PDG will invoke when the node should partition a list of upstream work items. The method is called with a temporary PDG_PartitionHolder object used to construct the partitions, the PDG_WorkItemArray containing all of the work items from the input node(s), and an optional target PDG_WorkItem if the partitioner is performing targeted partitioning.

The method can use whatever logic or parameter evaluations it needs in order to fill the holder with partitions. It is only allowed to add work items from the input list to partitions; it is an error to reference any other work items from within the onPartition callback.

Partitions are defined by an index value starting from 0. It's also possible to add a work item to all partitions using the PDG_PartitionHolder instance passed to the callback function, without needing to manually loop through the entire index space.
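The index-based model can be illustrated with a small standalone sketch. It does not use the HDK: SketchHolder and both of its methods are hypothetical stand-ins written for this illustration, and work items are reduced to plain integer ids. The point is only that an item added "to all partitions" is recorded once and merged into every partition when the result is resolved, so the caller never has to loop over the index space.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical stand-in for a partition holder. This is NOT the HDK
// PDG_PartitionHolder class; work items are plain integer ids.
class SketchHolder
{
public:
    // Adds an item to the partition with the given index. The partition
    // is created on first use.
    void addItemToPartition(int item, int index)
    {
        myPartitions[index].insert(item);
    }

    // Records an item that should belong to every partition.
    void addItemToAllPartitions(int item)
    {
        myShared.insert(item);
    }

    // Builds the final partitions: each partition's own items plus all
    // of the shared items.
    std::map<int, std::set<int>> resolve() const
    {
        std::map<int, std::set<int>> result = myPartitions;
        for (auto&& entry : result)
            entry.second.insert(myShared.begin(), myShared.end());
        return result;
    }

private:
    std::map<int, std::set<int>> myPartitions;
    std::set<int> myShared;
};
```

In this sketch the shared item is merged at resolve time, so it also reaches partitions that were created after the call to addItemToAllPartitions.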

Callback Invocation

Partitioner nodes have one callback function, PDG_NodeCallback::onPartition(), which is invoked whenever PDG needs to partition a list of input work items. The "Partition When" parameter, added to all partitioner nodes by default, determines whether the node should process input work items once they're generated, or wait for them to be cooked before partitioning them. Through the parameter interface it's also possible to specify a "target" node. In that case the partitioning logic is applied to the work items in that node instead, but the final partitions still end up depending on the direct inputs.

Under most conditions, the partitioning callback is only invoked once per graph cook, after all of the input work items are ready. However, if the "Split by Attribute" toggle is enabled on the node, the input work items are split into groups based on their distinct values for the specified "split" attribute. The PDG_NodeCallback::onPartition() callback is then invoked separately for each of those groups of work items, and PDG makes sure the resulting partition indices are flattened.
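One way to picture the split-and-flatten behaviour is the standalone sketch below. It is an illustration only, not PDG's implementation: SketchItem, partitionWithSplit and the callback type are hypothetical names, and the offset-based flattening (each group's indices start after the previous group's range) is an assumption about how flattening could work, not documented behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a work item: the value of the "split"
// attribute plus a frame number.
struct SketchItem
{
    std::string split;
    int frame;
};

// Stand-in for the per-group partition callback: returns a partition
// index that is local to the group being processed.
using LocalIndexFn = std::function<int(const SketchItem&)>;

// Returns one flattened partition index per input item, in input order.
std::vector<int> partitionWithSplit(
    const std::vector<SketchItem>& items,
    const LocalIndexFn& local_index)
{
    // Group item positions by their split attribute value.
    std::map<std::string, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < items.size(); ++i)
        groups[items[i].split].push_back(i);

    std::vector<int> flat(items.size(), -1);
    int offset = 0;
    for (auto&& group : groups)
    {
        // The callback runs once per group and only sees local indices.
        int max_local = -1;
        for (std::size_t i : group.second)
        {
            const int local = local_index(items[i]);
            flat[i] = offset + local;
            max_local = std::max(max_local, local);
        }
        // Assumed flattening rule: the next group starts after this
        // group's highest index.
        offset += max_local + 1;
    }
    return flat;
}
```

With a callback that returns 0 for even frames and 1 for odd frames, two split values "a" and "b" each get their own index range, so items from different groups never land in the same partition even though the callback returned the same local index for them.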

The callback implementation must be thread safe, since it will run on a background thread during the PDG graph cook. PDG ensures that all of the input work items are safe for read access when the callback evaluates, but it is unsafe, and an error, to attempt to write to an input work item during the partitioning callback. It is also unsafe to access any other work items or nodes in the PDG graph.

Partitioning Work Items

Work items can be added to partitions through a partition holder by calling PDG_PartitionHolder::addItemToPartition(). The work item is passed directly to the method, along with an index that determines which partition the work item should be added to. Defining an indexing scheme is the responsibility of the partitioner node itself.

As a basic example, the following implementation of onPartition adds all upstream work items to the same partition, with index 0:

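PDG_CustomPartitioner::onPartition(
    PDG_PartitionHolder* holder,
    const PDG_WorkItemArray& work_items,
    const PDG_WorkItem* target)
{
    // Collects any error text reported by the holder.
    UT_WorkBuffer errors;
    for (auto&& work_item : work_items)
    {
        // Every input work item goes into the partition with index 0.
        if (!holder->addItemToPartition(work_item, 0, false, errors))
        {
            myNode->addError(errors.buffer());
        }
    }
}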
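For a slightly richer indexing scheme than the single-partition case above, the standalone sketch below groups items by the parity of an integer attribute, in the spirit of a "partition by parity" node. It does not use the HDK and is not taken from any HDK sample: ParityItem and partitionByParity are hypothetical names, work items are reduced to a map of named integer attributes, and skipping items that lack the attribute is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a work item: a set of named integer attributes.
struct ParityItem
{
    std::map<std::string, int> attribs;
};

// Returns two partitions holding positions into the input list:
// index 0 for even attribute values, index 1 for odd ones.
// Items without the attribute are left out of both partitions
// (an assumption of this sketch).
std::vector<std::vector<std::size_t>> partitionByParity(
    const std::vector<ParityItem>& items,
    const std::string& attrib_name)
{
    std::vector<std::vector<std::size_t>> partitions(2);
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        auto it = items[i].attribs.find(attrib_name);
        if (it == items[i].attribs.end())
            continue;

        // A nonzero remainder means odd; this also holds for negative
        // values, where the remainder is -1.
        const std::size_t index = (it->second % 2 != 0) ? 1 : 0;
        partitions[index].push_back(i);
    }
    return partitions;
}
```

The indexing scheme here is fixed at two partitions, which keeps the mapping from attribute value to partition index trivial. A real node would pass each work item and its computed index to the partition holder instead of collecting positions in a vector.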

Registering the Node

The shared library that contains the custom node needs to have a registerPDGTypes function, which is responsible for registering any custom PDG nodes. The PDG/PDG_PartitionByParity.C and PDG/PDG_PartitionByParity.h sample files show how this might be implemented:

{
    PDG_NodeInterface partitioner("partitionbyparity");
    partitioner.addParameter(
        PDGT_Value::eString, "attributename", "Attribute Name");
    registry->registerPartitioner<HDK_Sample::PDG_PartitionByParity>(
        "partitionbyparity", "Partition by Parity", "Partitioners", partitioner);
}

Performance

For comparison, a Python implementation of the same partitioner node is also included in the HDK samples as PDG/partitionbyparity.py. Python-based nodes are generally slower than C++ nodes, and cannot evaluate in parallel. However, for certain operations it may be convenient to use Python and its large standard library instead of C++.

The table below shows a performance comparison between the C++ and Python versions of the Partition by Parity example node. Each version was tested with a single input node containing 100,000 work items with a randomly generated integer attribute. The table shows the total time taken to evaluate the onPartition function for each of the implementations.

File                           Time
PDG/PDG_PartitionByParity.C    27ms
PDG/partitionbyparity.py       464ms