Houdini 20.0 Executing tasks with PDG/TOPs

Introduction to TOPs

Explains the basic concepts behind TOP networks and what you can do with them.

On this page

Overview

You can specify work to be done and how to manipulate the results of that work by building a network of TOP nodes. TOP nodes generate work items to do work and/or store information in attributes. There are several types of TOP nodes, but the two main types are: processors that generate new work items and partitions that create dependencies between items so they wait until all items in the partition finish.

A TOP network describes a recipe for generating work items, efficiently running them in processes on a single machine or render farm, and figuring out dependencies so as many items can run in parallel as possible. This recipe is a PDG (Procedural Dependency Graph) that describes the work items to be completed and their dependencies.

By establishing the network of dependencies between all the bits of work needed to produce the final output, TOPs can find the more efficient way to “cook” the city recipe. It can also do the minimum amount of work necessary to “recook” the city when something changes, because it knows which parts depend on the changed information.

As part of Houdini, TOPs are especially geared toward accomplishing effects work, such as simulation, rendering, and compositing, however it can be useful for any kind of work that can be broken up into individual items with dependencies between them.

Working with TOPs

TOP node UI

For more information, see TOP network user interface.

Processors

Processors generate new work items. Processors can generate multiple new work items based on previous work items (“fan-out”).

A processor node can create new work items from scratch, based on the node’s parameters, or it can generate new work items from the work items incoming from the input node. Many nodes can work both ways. If the Wedge node has no input, it will create work items for the number of wedges specified in its parameters. If it has an input, it create new work items for the number of wedges, multiplied by the number of incoming work items.

When a processor node generates a new “child” work item based on a “parent” item, it passes down the parent’s attributes, so work items can pass results downstream.

Partitions

Partition nodes join together incoming work items together based on various criteria. This is how PDG combines multiple pieces of work into a single piece for further processing (“fan-in”).

Different partition nodes group together upstream work items in different ways.

Partitioning has two effects

  • It creates dependencies between the grouped work items, so the partition doesn’t continue processing until all the work items in the group are done.

  • It (optionally) merges the attribute values on the grouped work items. So, for example, any work items created from the partition will receive the aggregated list of output file paths from the grouped work items as their input attribute.

The most commonly used partition node is Wait for All, which groups all incoming work items into a single partition, causing the partition to wait for all the upstream work to finish before moving on.

You can partition based on attributes (for example, grouping all items that worked on the same frame), by a custom function, or spatially. For example, if you scattered trees over a terrain, you could divide a terrain into a grid and group all the items generating geometry in a certain tile together.

By themselves, the items inside partition nodes don’t do any work. They just hold the merged information of the grouped work items. Subsequent work items can then operate on the merged information.

Schedulers

Scheduler nodes actually run executables inside work items.

  • Different scheduler types represent different methods for running the executables. For example, the Local Scheduler runs executables using a process pool on the local machine, while the HQueue Scheduler puts the executables on an HQueue render farm.

  • You can have multiple scheduler nodes in a TOP network, allowing you to set up the network to cook in different ways depending on circumstances (for example, you might want to cook locally while testing changes and only cook on the farm as part of a nightly process). The TOP Network node’s Default TOP Scheduler parameter controls which scheduler is used to cook the network.

Simplified TOP network example

The simplified illustration on the right depicts a basic network to render out five frames of an animation at different quality levels, and create mosaics of the same frame at different qualities, so we can understand how the quality setting affects the final image.

  1. A Wedge node generates four “dummy” work items containing wedge variants (for example, different values for Mantra’s Pixel Samples parameter).

  2. Append a ROP Mantra Render node and set the frame range to 1-5. This generates 5 new work items for each “parent” item. The wedged attribute is passed down to the new work items.

  3. A Partition by Frame node creates new “partition” items with dependencies on the work items with the same frame number. This makes the partition wait until all variants of the same frame are done before proceeding to the next step.

  4. An ImageMagick node generates new work items to take the rendered variants of each frame and merge them together into a mosaic image.

    Of course, we could do more like overlay text on the image showing what settings correspond to each sub-image in the mosaic.

  5. A Wait for All node at the end generates a partition item that waits for all work to finish.

    This lets you add work items after the “wait for all” that only run after all the “main” work is done. This is where you would put, for example, post-render scripts and/or notifications to the user that the work is done.

See commonly used TOP nodes for more information.

Work items

Each processor node in the TOP network generates a certain number of work items. The work items created by a node a depicted inside the node body in the network editor (see network interface below for more information).

Many work items represent a “job”, that is, a script or executable to run, either in a process on a single computer or on a render farm. However, some work items exist simply to hold attributes, and don’t actually cause any work to be done.

For example, work items created by the HDA Processor node represent work: they cook a Houdini asset on an input file. However, items created by the File Pattern node are just holders for each matched file path. Work items in downstream processors are expected to do something with those file names.

Attributes

A work item contains attributes, named bits of information similar to attributes on a point in Houdini geometry. The script or executable associated with the work item can read the attributes to control the work they do. Attributes are passed down to work items created from “parent” work items, so you can use them to have the result of a work item affect the processing of its child items, group items together, and pass information down through the network.

Custom code can read attributes to control generation of work items, however the most common use of attributes is to reference them in parameters of TOP nodes and/or Houdini networks and nodes that are referenced by the TOP network.

  • For example, let’s say you want to wedge different render qualities. You can use the Wedge TOP to create work items that have a pixelsamples attribute set to different values. Then, in the ROP Mantra Render TOP, you can set Pixel samples using @pixelsamples to reference the attribute value.

  • You can also reference work item attributes in external assets/networks called by the TOP network. For example, the HDA Processor cooks a Houdini asset for each work item. You can use @attribute references in that asset’s parameters to pull values from the work item.

You can reference a component of a vector using @attribute.component, where component is a zero-based number, or x, y, or z (equivalent of 0, 1, 2). For example @pos.x or @pdg_output.0. You can reference all components of an attribute array as a space-separated string using pdgattribvals.

Nodes and custom code can add arbitrary attributes, however work items also have a few built-in attributes that are always accessible.

For more information about attributes, see Using Attributes.

In-process vs. out-of-process

Work items cook in either the current Houdini session (in-process) or in a separate process that gets created by a Scheduler node (out-of-process). This includes work items that cook on the farm like the HQueue Scheduler TOP and Tractor Scheduler TOP nodes which run out-of-process.

Work items that cook in-process have access to all of the data in the current Houdini session, but it is not possible for them to edit the scene in any way or cook other network contexts because they are being processed in the background. Work items that cook out-of-process however have some start up overhead and have access to only a limited about of information (just the saved work item attribute data).

Static vs. dynamic

Important Note

The static vs. dynamic terminology has mostly been replaced by the Generate When parameter (found on all the Processor nodes) terminology. This parameter configures when a node runs its internal logic to produce work items. For example, generate work items when All Upstream Items are Generated, All Upstream Items are Cooked, or when Each Upstream Item is Cooked. The parameter’s default option is Automatic, which tries do the correct thing for a node based on what PDG can understand about its graph.

Static

Some work items are known to be needed before the TOP network even starts running, just based on the node parameters. These are called static work items.

For example, a ROP Mantra TOP with no inputs generates a work item for each frame to render, and the number of frames to render is set in the parameters, so the number of work items is known ahead of time. And if the next node just generates a new work item for each incoming work item, then its work items are known “statically” as well.

Film workflows are often frame-based and so lend themselves to generating static work items, because the number of frames to render is known ahead of time.

Dynamic

Sometimes work items can only be generated once the work is being done. These are called dynamic work items.

Games and effects workflows are often more dynamic. For example, a secondary simulation on top of crowd agents might divide up the crowd into tiles based on how many agents are in each tile. This can only be done dynamically after generating the crowd simulation and reading in the files.

  • As a general rule of thumb, static is preferable to dynamic: because the number of work items is visible, you can get a sense of how much work will be done, and Houdini can give more accurate estimates of percent completed.

  • However, depending on the workflows, dynamic work items can be unavoidable, and in fact almost the entire network may be dynamic. For example, in a games workflow where you generate a level from layout curves, you don’t know how much work there is to do until you intersect the curves with other curves, intersect the curves with tiles, and so on.

  • There might be cases where Houdini can’t tell whether work is static, and gives you the choice between generating work items dynamically or statically. In these cases, it’s always safe to mark the work dynamic, since the TOP network can always figure out what work items are needed at runtime. However, if you know that the work items are static, you can mark the node as static to get the advantages of static items.

    (A Wait for All node node can convert a dynamic number of work items to a static number (one). Other partitions may also convert dynamic to static. These cases should be handled by the Automatic setting, without you needing to change node settings manually.)

    • Most processor nodes have a Generate When menu that lets you choose Static, Dynamic, or Automatic (Automatic means “if this node is capable of static generation, or it has no inputs, generate static”). Processors will lack this option if they can only work dynamically or only work statically.

    • Some partition nodes have a parameter called Use dynamic partitioning. Turn this on when the partitioning depends on information generated dynamically. Turn it off if the partitioning only depends on information that was already “known” in the closest upstream static work items, in which case the groupings can also be computed statically. (Some partition nodes may not provide this option if how the node works dictates dynamic or static partitioning.)

Tip

If you have a TOP network that has no cached results (no work items inside the nodes), you can choose Tasks ▸ Generate Static Work Items in the network editor to generate all statically-known items. This can often help give a sense of how much work is involved in the network.

Executing tasks with PDG/TOPs

Basics

Beginner Tutorials

Next steps

  • Running external programs

    How to wrap external functionality in a TOP node.

  • File tags

    Work items track the results created by their work. Each result is tagged with a type.

  • PDG Path Map

    The PDG Path Map manages the mapping of paths between file systems.

  • Feedback loops

    You can use for-each blocks to process looping, sequential chains of operations on work items.

  • Service Blocks

    Services blocks let you define a section of work items that should run using a shared Service process

  • PDG Services

    PDG services manages pools of persistent Houdini sessions that can be used to reduce work item cooking time.

  • Integrating PDG with render farm schedulers

    How to use different schedulers to schedule and execute work.

  • Visualizing work item performance

    How to visualize the relative cook times (or file output sizes) of work items in the network.

  • Event handling

    You can register a Python function to handle events from a PDG node or graph

  • Tips and tricks

    Useful general information and best practices for working with TOPs.

  • Troubleshooting PDG scheduler issues on the farm

    Useful information to help you troubleshoot scheduling PDG work items on the farm.

  • PilotPDG

    Standalone application or limited license for working with PDG-specific workflows.

Reference

  • All TOPs nodes

    TOP nodes define a workflow where data is fed into the network, turned into work items and manipulated by different nodes. Many nodes represent external processes that can be run on the local machine or a server farm.

  • Processor Node Callbacks

    Processor nodes generate work items that can be executed by a scheduler

  • Partitioner Node Callbacks

    Partitioner nodes group multiple upstream work items into single partitions.

  • Scheduler Node Callbacks

    Scheduler nodes execute work items

  • Custom File Tags and Handlers

    PDG uses file tags to determine the type of an output file.

  • Python API

    The classes and functions in the Python pdg package for working with dependency graphs.

  • Job API

    Python API used by job scripts.

  • Utility API

    The classes and functions in the Python pdgutils package are intended for use both in PDG nodes and scripts as well as out-of-process job scripts.