Houdini 20.0 Executing tasks with PDG/TOPs

Processor Node Callbacks

Processor nodes generate work items that can be executed by a scheduler

Overview

Processors are one of the main types of node in a PDG graph. Each processor node is capable of producing new work items using upstream input work items, parameters on the node itself and/or external resources. The node is able to set attribute values on the work items, configure the command line for the work item and define if the work item should be evaluated in a child process or in the current Houdini session. The Merge and Attribute Create are simple processors that create work items with attribute values, while the ROP Fetch is a more complicated processor that creates work items that need to be scheduled for execution.

All processor nodes can be either Static or Dynamic, which determines when the processor creates work items. A static processor generates its work items during the static pre-pass that happens before the PDG graph cook. A dynamic processor creates one or more work items each time a work item in an input node cooks. This behavior is configured using the Generate When parameter that appears on all processor nodes. An instance of a processor node will only generate one type of work item, based on that parameter. It isn’t possible for a processor to have a mix of static and dynamic work items at the same time.

Registering Custom Nodes

Custom processor nodes can be registered through the PDG Type Registry. You can do so on-demand from the Python Shell, or by adding a registration script to the $HOUDINI_PATH/pdg/types directory. When Houdini starts, PDG automatically loads all scripts and modules from the pdg/types directories that are found on the Houdini search path. For example, you can create a script that defines a custom node class and save it as $HOME/houdiniX.Y/pdg/types/mynode.py. In addition to the node definition, you’ll also need to define a registerTypes function in the script file. The function is called automatically when PDG loads the script and is responsible for registering any node(s) defined in the .py file. For example:

import json

import pdg
from pdg.processor import PyProcessor

# Custom node definition
class CustomProcessor(PyProcessor):
    def __init__(self, node):
        PyProcessor.__init__(self, node)

    # Parameter template name
    @classmethod
    def templateName(cls):
        return "customprocessor"

    # Parameter template definition
    @classmethod
    def templateBody(cls):
        return json.dumps({
            "name": "customprocessor",
            "outputPorts" : [
                {
                    "name" : "output"
                }
            ],
            "parameters" : [
                {
                    "name" : "customparm",
                    "type" : "Integer",
                    "value" : 1,
                    "label" : "Custom Parameter"
                }
            ]})

    # Node callback implementations should be
    # defined as member functions of the class.


# Register the custom node with PDG using a stand-alone registerTypes 
# function.
def registerTypes(type_registry):
    type_registry.registerNode(CustomProcessor, pdg.nodeType.Processor,
        name="customprocessor", label="Custom Processor", category="Custom")

See the pdg.TypeRegistry class for additional details on the registration methods, as well as other methods for querying or accessing node definitions.

Node Callbacks

Each processor node has several callbacks that can be implemented to control how it operates. When writing a processor node the only callback you're required to implement is onGenerate, since that hook is responsible for actually producing work items. If the work items are marked as in process, the onCookTask callback must be implemented in order to provide the logic used to cook the work items. Work items are marked as in process if the inProcess keyword argument is set to True when constructing the item. All other node callbacks are optional and are only used to further customize the behavior of the node.

Callbacks can return a pdg.result to indicate whether they succeeded or failed. For example, when the callback completes successfully it should return pdg.result.Success. If the callback isn’t fully implemented, and the base class version provided by PDG should be run instead, the callback can return pdg.result.Missing. The other special result types, such as pdg.result.All, are not used for processor nodes. Uncaught Python exceptions thrown during a callback will be handled by PDG, and the callback will automatically be marked as failed.
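The return-value rules above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not PDG's implementation: the Result enum stands in for the pdg.result values named above, and run_callback, builtin_generate and the on_generate_* functions are hypothetical names chosen so the example runs outside Houdini.

```python
import enum


class Result(enum.Enum):
    """Stand-in for the pdg.result values named in the text."""
    Success = "success"
    Failure = "failure"
    Missing = "missing"


builtin_calls = []  # records each time the base class version runs


def builtin_generate():
    """Hypothetical base class implementation provided by the framework."""
    builtin_calls.append("builtin")
    return Result.Success


def run_callback(custom, builtin):
    """Model of the dispatch rules described above (not PDG's real code)."""
    try:
        outcome = custom()
    except Exception:
        # An uncaught exception marks the callback as failed.
        return Result.Failure
    if outcome is Result.Missing:
        # The callback asked for the base class version to run instead.
        return builtin()
    return outcome


def on_generate_ok():
    return Result.Success


def on_generate_unimplemented():
    return Result.Missing


def on_generate_broken():
    raise RuntimeError("bad parameter value")


results = [
    run_callback(callback, builtin_generate)
    for callback in (on_generate_ok, on_generate_unimplemented, on_generate_broken)
]
```

Only the second callback triggers the fallback, and the third is reported as a failure without the exception escaping to the caller.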

If one or more processor nodes return a failure during the static generation pass, the cook is stopped in that branch of nodes. For dynamic processors a failure will not stop the cook, and work items produced by other onGenerate calls from different upstream items will continue to execute.

Warning

PDG will make sure that the attributes for any work items passed as callback inputs are safe to read from for the duration of the callback. It is also safe to read and write to any work items created during the callback. It is not valid, however, to store work items in member variables on the node, or to write attributes to the upstream work items.

While any of the Task callbacks are running, the Python code has exclusive access to the attributes of the work item passed directly to the function. The callback code can read and write attributes on that work item, or add output files. The callback code is free to read attributes from the parent or dependencies of that work item as well, but it should not access any other work items in the node.

onGenerate(item_holder, upstream_items, generation_type) → pdg.result

This callback is evaluated any time the processor node should generate new work items. The pdg.WorkItemHolder argument is a factory object used to create new work items in the node. Work items are not added to the processor node or PDG graph until the callback successfully returns. If the callback fails, anything created during the portion of the callback that did evaluate is deleted and discarded.

Work items are added to the holder using the addWorkItem method, which can be passed either a list of keyword arguments or a pdg.WorkItemOptions helper object. The arguments are used to configure the name, index, and type of the new work item. Each of the entries listed in the pdg.WorkItemOptions API doc can also be passed as a keyword argument, using the same name:

# Using options:
options = pdg.WorkItemOptions()
options.name = "example"
options.inProcess = True
item_holder.addWorkItem(options)

# Or using kwargs:
item_holder.addWorkItem(name="example", inProcess=True)

The addWorkItem method returns the new work item, so that the callback code can set attribute values or the work item’s command line:

new_item = item_holder.addWorkItem()
new_item.setStringAttrib('example_attrib', 'example value')
new_item.setCommand('echo "hello"')

The upstream items list passed to the callback contains no work items if the node has no inputs. If the node has input nodes, the list contains one work item if the node is dynamic or the full list of upstream work items if the node is static. Finally, the generation_type argument determines whether the callback is being called on a static node, dynamic node or for work item regeneration. pdg.generationType lists the details of the enum passed to the callback function.
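Putting these pieces together, a generation callback might look like the sketch below. FakeWorkItem and FakeWorkItemHolder are stand-ins written for this example so that it runs outside Houdini. In a real node, onGenerate is a method of the PyProcessor subclass, item_holder is a pdg.WorkItemHolder and the callback returns pdg.result.Success. The index and parent keywords are assumed to match the pdg.WorkItemOptions entries of the same names, and the source_name attribute is hypothetical.

```python
SUCCESS = "Success"  # stand-in for pdg.result.Success


class FakeWorkItem:
    """Stand-in for pdg.WorkItem with only the methods used below."""

    def __init__(self, name="", index=0, parent=None):
        self.name = name
        self.index = index
        self.parent = parent
        self.attribs = {}
        self.command = ""

    def setStringAttrib(self, name, value):
        self.attribs[name] = value

    def setCommand(self, command):
        self.command = command


class FakeWorkItemHolder:
    """Stand-in for pdg.WorkItemHolder: collects the items it creates."""

    def __init__(self):
        self.items = []

    def addWorkItem(self, **kwargs):
        item = FakeWorkItem(**kwargs)
        self.items.append(item)
        return item


def onGenerate(item_holder, upstream_items, generation_type):
    if not upstream_items:
        # The node has no inputs, so create a single work item.
        item = item_holder.addWorkItem(index=0)
        item.setCommand('echo "no upstream item"')
        return SUCCESS

    # One upstream item for a dynamic node, the full list for a static node.
    for index, upstream in enumerate(upstream_items):
        item = item_holder.addWorkItem(index=index, parent=upstream)
        item.setStringAttrib("source_name", upstream.name)
        item.setCommand('echo "processing {}"'.format(upstream.name))
    return SUCCESS


upstream = [FakeWorkItem(name="a"), FakeWorkItem(name="b")]
holder = FakeWorkItemHolder()
status = onGenerate(holder, upstream, None)
```

Because the same loop handles a list of one item and a list of many, the callback body does not need to branch on generation_type for this simple case.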

onRegenerate(item_holder, existing_items, upstream_items) → pdg.result

This callback is called during the cook process when the node has existing work items, but a parameter has changed since the last cook. It gives the node the opportunity to dirty, delete, modify or add new work items. By default, all nodes have a standard built-in implementation that will recreate and merge the work items by calling the onGenerate hook. A regenerated work item is not necessarily dirty. If none of the work item’s attributes are modified the item is left in the same state it was in before regeneration began. The arguments to this callback function are similar to onGenerate, however the extra existing_items list contains the list of static work items already in the node. The generate_type argument is set to a value in the pdg.generationType enum. Your code can use it to determine if the node is dynamic or static.

You generally only need to implement this callback if your node uses external resources to create work items, for example by querying a resource via an HTTP request, an external database or a file on disk. Alternatively, if you want your node to always attempt to regenerate work items on each cook, you can use the onConfigureNode node callback to set the isAlwaysRegenerate flag. PDG will run the regeneration logic even if your node does not have an onRegenerate implementation; it will fall back to the built-in implementation if necessary. This can be useful when dealing with external resources that are challenging to track, or simply to avoid the need to write custom code to handle those resources. For example, the File Pattern node always regenerates work items from the file system each time it cooks.

Note

This callback replaces the older onRegenerateStatic and onRegenerateChildren methods from previous versions of Houdini. Both static and dynamic work item regeneration now use the same callback, much like static and dynamic generation. If your node has one of the old callbacks, and does not have the new onRegenerate callback, PDG will use the old method and emit a deprecation warning.

onAddInternalDependencies(dependency_holder, work_items, is_static) → pdg.result

This method is called to add sibling/internal dependencies between work items created during an onGenerate callback. These are dependencies that exist between work items in the same node. The input to this callback is the list of work items produced in the last onGenerate callback. Note that internal dependencies can only be added between work items that were created during the same onGenerate call. The is_static boolean indicates whether the incoming work items were created statically or dynamically.

PDG only cooks work items once all of their dependencies are cooked, so adding internal dependencies is useful if your node generates work items that need to be completed in a specific order. For example, the begin node in a feedback loop block adds internal dependencies between the iterations in the loop.
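As a sketch of the ordering use case, the callback below chains the work items so that each one depends on the item with the previous index. FakeWorkItem and FakeDependencyHolder are stand-ins so the example runs outside Houdini. The addDependency(work_item, dependency) call and its argument order are assumptions for illustration; check the dependency holder API documentation for the exact method.

```python
SUCCESS = "Success"  # stand-in for pdg.result.Success


class FakeWorkItem:
    """Stand-in for pdg.WorkItem exposing only an index."""

    def __init__(self, index):
        self.index = index


class FakeDependencyHolder:
    """Stand-in for the dependency_holder argument.

    Records (item index, dependency index) pairs instead of editing a graph.
    """

    def __init__(self):
        self.dependencies = []

    def addDependency(self, work_item, dependency):
        self.dependencies.append((work_item.index, dependency.index))


def onAddInternalDependencies(dependency_holder, work_items, is_static):
    # Sort by index so the chain does not rely on the incoming list order.
    ordered = sorted(work_items, key=lambda item: item.index)
    for previous, current in zip(ordered, ordered[1:]):
        # current may only cook once previous has finished.
        dependency_holder.addDependency(current, previous)
    return SUCCESS


holder = FakeDependencyHolder()
items = [FakeWorkItem(2), FakeWorkItem(0), FakeWorkItem(1)]
status = onAddInternalDependencies(holder, items, True)
```

With three items the holder records two dependencies, which is enough to force the items to cook strictly in index order.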

onPreCook() → pdg.result

This callback is called once on all nodes that will be evaluated during a cook, at the beginning of the cook. It is called before any onGenerate callbacks and gives the node the opportunity to load shared resources or initialize variables.

onPostCook() → pdg.result

This callback is called once on all nodes that were active during the cook, after the last work item has completed cooking. This gives nodes an opportunity to unload shared resources used during the cook.
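The pairing of onPreCook and onPostCook can be sketched as a load/unload cycle around generation. SharedTableNode is a hypothetical class that does not derive from PyProcessor so the example runs outside Houdini, the resolution table is invented data, and a plain list stands in for the work item holder.

```python
SUCCESS = "Success"  # stand-in for pdg.result.Success


class SharedTableNode:
    """Hypothetical node; a real one would derive from PyProcessor."""

    def __init__(self):
        self.resolution_table = None

    def onPreCook(self):
        # Runs once before any onGenerate call: load the shared resource.
        self.resolution_table = {"preview": 256, "final": 2048}
        return SUCCESS

    def onGenerate(self, item_holder, upstream_items, generation_type):
        # The table loaded in onPreCook is available to every generate call.
        for index, quality in enumerate(sorted(self.resolution_table)):
            item_holder.append((index, quality, self.resolution_table[quality]))
        return SUCCESS

    def onPostCook(self):
        # Runs once after the last work item cooks: release the resource.
        self.resolution_table = None
        return SUCCESS


node = SharedTableNode()
created = []  # plain list standing in for the work item holder
node.onPreCook()
node.onGenerate(created, [], None)
node.onPostCook()
```

Note that the member variable holds a shared resource, not work items; storing work items on the node is not valid, as the warning earlier on this page explains.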

onPrepareTask(work_item) → pdg.result

This callback is called once for each work item in the node immediately before it is scheduled for cooking. This method is called before PDG checks for cached files on disk, so it can be used as a way to add expected output files for caching purposes. It can also be used to access output files or attributes of the work item’s dependencies, modify the command line of the work item, or do other last-minute work needed before cooking the item. If this method returns pdg.result.Failure the work item is marked as failed.

The ROP Fetch node uses this callback when running a distributed simulation to ensure that the sim tracker is running and pass tracker information to sim work items before they cook. The work items in the ROP Fetch can be generated statically and the tracker port/IP aren’t known until after the cook begins, so the tracker details are added onto simulation items using this callback.

The onPrepareTask callback is called for each subitem in a batch as well as on the batch parent itself. The order of the callback invocations depends on the activation mode of the batch work item. If the batch has its activation mode set to pdg.batchActivation.All PDG will run the onPrepareTask hook for each subitem, and then the batch itself, immediately before the batch is scheduled for execution. If the activation mode is set to one of the other options the callback will be called on each subitem once the subitem’s dependencies are satisfied. The callback is run on the batch parent once the batch is ready to execute. In other words, there is no guarantee on the order of the onPrepareTask calls when the batch is set to pdg.batchActivation.First. The callback may also be called on the batch parent between subitem calls.
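A minimal sketch of a prepare hook follows, covering the three uses mentioned above: declaring an expected output, adjusting the command line and failing the work item. FakeWorkItem is a stand-in so the example runs outside Houdini. The attribValue and addExpectedOutputFile method names are assumed to mirror pdg.WorkItem, and the output_path attribute, the file/image tag usage and the render command are hypothetical.

```python
SUCCESS = "Success"  # stand-in for pdg.result.Success
FAILURE = "Failure"  # stand-in for pdg.result.Failure


class FakeWorkItem:
    """Stand-in for pdg.WorkItem with only the members used below."""

    def __init__(self, attribs):
        self.attribs = dict(attribs)
        self.command = ""
        self.expected_outputs = []

    def attribValue(self, name):
        return self.attribs.get(name)

    def setCommand(self, command):
        self.command = command

    def addExpectedOutputFile(self, path, tag):
        self.expected_outputs.append((path, tag))


def onPrepareTask(work_item):
    output_path = work_item.attribValue("output_path")
    if not output_path:
        # Returning a failure marks the work item as failed.
        return FAILURE

    # Declare the file before the cache check on disk takes place.
    work_item.addExpectedOutputFile(output_path, "file/image")

    # Last-minute command line change based on the resolved path.
    work_item.setCommand('render --out "{}"'.format(output_path))
    return SUCCESS


ready_item = FakeWorkItem({"output_path": "/tmp/render/frame.0001.exr"})
broken_item = FakeWorkItem({})
ready_status = onPrepareTask(ready_item)
broken_status = onPrepareTask(broken_item)
```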

onCookTask(work_item) → pdg.result

This callback is called when an in-process work item needs to cook. It is only run for work items that are explicitly created as in-process items, by passing the inProcess=True argument when adding them to the work item holder during the onGenerate callback. This callback is able to write attributes to the work item and attach output results, as well as running whatever in-process logic is necessary to cook the work item. The callback runs inside the main Houdini process on a background thread.

When set to evaluate in process, the Python Script node uses this callback to execute its script.
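An in-process cook might be sketched as follows: the callback does its work directly in Python, writes an attribute and attaches an output file. FakeWorkItem is a stand-in so the example runs outside Houdini. The attribValue, setIntAttrib and addOutputFile names are assumed to mirror pdg.WorkItem, while the frame and out_dir attributes and the file/text tag are hypothetical.

```python
import os
import tempfile

SUCCESS = "Success"  # stand-in for pdg.result.Success


class FakeWorkItem:
    """Stand-in for pdg.WorkItem with only the members used below."""

    def __init__(self, attribs):
        self.attribs = dict(attribs)
        self.outputs = []

    def attribValue(self, name):
        return self.attribs.get(name)

    def setIntAttrib(self, name, value):
        self.attribs[name] = value

    def addOutputFile(self, path, tag):
        self.outputs.append((path, tag))


def onCookTask(work_item):
    frame = work_item.attribValue("frame")
    out_dir = work_item.attribValue("out_dir")

    # The in-process work itself: write a small text file for this frame.
    out_path = os.path.join(out_dir, "frame_{:04d}.txt".format(frame))
    with open(out_path, "w") as handle:
        handle.write("frame {}\n".format(frame))

    # Record the results on the work item that was passed to the callback.
    work_item.setIntAttrib("frame_squared", frame * frame)
    work_item.addOutputFile(out_path, "file/text")
    return SUCCESS


item = FakeWorkItem({"frame": 3, "out_dir": tempfile.mkdtemp()})
status = onCookTask(item)
```

The callback only touches the work item it was given, in line with the access rules in the warning earlier on this page.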

onPostCookTask(work_item) → pdg.result

This callback is called when a work item successfully cooks, but before any downstream nodes generate from the work item. It can be used to do validation of the work item’s output files or attributes, or to make changes to the attributes based on output files created during the cook. The callback is run even if the work item was not cooked in-process. If the callback returns pdg.result.Failure, the work item will be marked as failed.

This callback is only called if the work item has been marked as needing to run post cook logic, using the pdg.WorkItem.setIsPostCook method.
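The validation use case can be sketched as a check that every output file exists on disk, followed by an attribute update based on those files. FakeFile and FakeWorkItem are stand-ins so the example runs outside Houdini. The outputFiles list with a path member on each entry is an assumption about the work item API, and the output_bytes attribute is hypothetical. In a real node the work item would also need to be flagged with setIsPostCook for this hook to run.

```python
import os
import tempfile

SUCCESS = "Success"  # stand-in for pdg.result.Success
FAILURE = "Failure"  # stand-in for pdg.result.Failure


class FakeFile:
    """Stand-in for an output file entry on a work item."""

    def __init__(self, path):
        self.path = path


class FakeWorkItem:
    """Stand-in for pdg.WorkItem with only the members used below."""

    def __init__(self, paths):
        self.outputFiles = [FakeFile(path) for path in paths]
        self.attribs = {}

    def setIntAttrib(self, name, value):
        self.attribs[name] = value


def onPostCookTask(work_item):
    total_size = 0
    for output in work_item.outputFiles:
        if not os.path.isfile(output.path):
            # A missing output marks the work item as failed.
            return FAILURE
        total_size += os.path.getsize(output.path)

    # Update an attribute based on the files created during the cook.
    work_item.setIntAttrib("output_bytes", total_size)
    return SUCCESS


tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "result.txt")
with open(good_path, "w") as handle:
    handle.write("done")

good_item = FakeWorkItem([good_path])
bad_item = FakeWorkItem([os.path.join(tmp_dir, "missing.txt")])
good_status = onPostCookTask(good_item)
bad_status = onPostCookTask(bad_item)
```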

onSelectTask(work_item) → pdg.result

This is a UI callback specific to TOPs. When a work item is selected in the TOPs UI, this hook is executed after the item has been selected. For example, it is used by the Wedge node to push work item parameter values to the scene, but could also be used to run custom visualization logic. There is also an event hook for work item selection. See the pdg.EventType event type list and the event handling overview for details.

onDeselectTask(work_item) → pdg.result

This callback is the opposite of the onSelectTask callback. It runs whenever a work item is deselected in the TOPs user interface. For example, the Wedge node uses this to restore parameter values to their original values on deselection.

onConfigureNode(node_options) → pdg.result

Invoked with a pdg.NodeOptions object that you can use to describe the configuration of a node instance based on the current parameters. You can set a text description of the node or configure whether or not the node requires all inputs to be generated in order to generate. This affects the behavior of the Generate When parameter when it is set to Automatic. If the node requires inputs to be generated, the Automatic parameter will always mean All Upstream Items are Generated.

For example, the ROP Fetch node requires access to the full list of upstream items when creating batches. When the node is in batch mode it uses the onConfigureNode method to indicate to PDG that the Automatic generation option should always mean All Upstream Items are Generated.

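def onConfigureNode(self, node_options):
    # In batch mode the node needs the full list of upstream items.
    if self['batch'].evaluate() > 0:
        node_options.setRequiresGeneratedInputs(True)
    else:
        node_options.setRequiresGeneratedInputs(False)

    # Evaluate the parameter and convert it to a string before concatenating.
    node_options.setDescription("Node: " + str(self['targetnode'].evaluate()))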
