Houdini 18.0 Executing Tasks

Scheduler Node Callbacks

Scheduler nodes execute work items

Overview

Schedulers are one of the main types of node in a PDG graph. The purpose of a scheduler node is to execute ready work items that are submitted to the scheduler node by PDG.

In addition, the scheduler must report status changes and ensure that jobs can communicate back to PDG when necessary. By default, jobs use the supporting Python module pdgjson to communicate via XMLRPC, and the scheduler is responsible for ensuring that the XMLRPC server is running. A custom scheduler can replace this mechanism if required.

Scheduler Callbacks

Each scheduler node has several callbacks that can be implemented to control how it operates. When writing a scheduler node, the only callback you're required to implement is onSchedule, since that hook is responsible for actually executing the ready work items. Work items that are marked as being in-process do not reach the scheduler and are instead handled by PDG’s internal scheduler. The other callbacks are optional and are only needed to further customize the behavior of the node.

Warning

The only callback that can safely write work item attributes is onSchedule. If you want to add attributes to a work item in the onTick callback, you need to use pdg.WorkItem.lockAttributes to safely modify the work item.

Additionally, your scheduler node should only keep references to work items that are actively running. Once your scheduler notifies PDG that a work item has succeeded or failed, it should no longer hold a reference to that work item.

The full scheduler node API is described in pdg.Scheduler.

applicationBin(self, name, work_item) → str

This callback is used when a node is creating a command that uses an application that can be parameterized by the scheduler. For example, the scheduler may expose UI to control which 'python' application is used for Python-based jobs.

Custom scheduler bindings can use their own application 'names' to work with custom nodes.

At minimum 'hython' and 'python' should be supported.
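As a minimal sketch of the idea, the lookup can be a table from application name to executable path. The SketchScheduler class, its constructor arguments, and the fallback to the bare name are all hypothetical and stand in for a real scheduler node, which would more likely read these paths from its own parameters.

```python
import sys


class SketchScheduler:
    """Hypothetical stand-in for a scheduler; not the real pdg.Scheduler base."""

    def __init__(self, python_bin=None, hython_bin="hython"):
        # Assumed configuration: a real node would likely read these from parameters.
        self._bins = {
            "python": python_bin or sys.executable,
            "hython": hython_bin,
        }

    def applicationBin(self, name, work_item):
        # Return the executable for a known application name.
        # Falling back to the bare name is an assumption of this sketch.
        return self._bins.get(name, name)


scheduler = SketchScheduler(python_bin="/usr/bin/python3")
print(scheduler.applicationBin("python", None))
print(scheduler.applicationBin("hython", None))
```

Custom application names used by custom nodes could be added to the same table.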

onSchedule(self, work_item) → pdg.scheduleResult

This callback is evaluated when the given pdg.WorkItem is ready to be executed. The scheduler should create the necessary job spec for its farm scheduler and submit it if possible. If it doesn’t have enough resources to execute the work item, it should return Deferred or FullDeferred, which tells PDG that the scheduler can’t accommodate the work item and that PDG should check back later.

Otherwise it should return Succeeded to indicate that the work item has been accepted.

The other return values are used when a work item is, for some reason, handled immediately inside the callback. This is not generally recommended because it forces work items to execute in series.

For example, Local Scheduler will return FullDeferred if it determines that all available 'slots' on the local machine are in use. On the other hand it will return Deferred if there are slots available but not enough for this particular work item. If there are enough slots, it will deduct the slots required, spawn a subprocess for the work item, and then add the work item to a private queue of running items to be tracked.
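The slot accounting described above can be sketched without Houdini. This is not Local Scheduler's actual implementation: ScheduleResult is a plain enum standing in for the pdg.scheduleResult values named in the text, SlotTracker is a hypothetical helper, and spawning the subprocess is left out.

```python
from enum import Enum


class ScheduleResult(Enum):
    """Stand-in for the pdg.scheduleResult values named in the text."""
    Succeeded = 1
    Deferred = 2
    FullDeferred = 3


class SlotTracker:
    """Hypothetical helper that tracks free slots and running items."""

    def __init__(self, total_slots):
        self.free_slots = total_slots
        self.running = {}  # item name -> number of slots it holds

    def schedule(self, item_name, slots_needed):
        if self.free_slots <= 0:
            # Every slot is in use: nothing can be accepted right now.
            return ScheduleResult.FullDeferred
        if slots_needed > self.free_slots:
            # Some slots are free, but not enough for this item.
            return ScheduleResult.Deferred
        # Deduct the slots and track the item (a real scheduler would
        # also spawn the job here).
        self.free_slots -= slots_needed
        self.running[item_name] = slots_needed
        return ScheduleResult.Succeeded

    def finish(self, item_name):
        # Release the slots and drop the reference to the finished item.
        self.free_slots += self.running.pop(item_name)


tracker = SlotTracker(total_slots=4)
results = [
    tracker.schedule("a", 3),
    tracker.schedule("b", 2),
    tracker.schedule("c", 1),
    tracker.schedule("d", 1),
]
print([r.name for r in results])
```

With four slots, "a" and "c" are accepted, "b" is deferred because only one slot is free at that point, and "d" arrives when no slots remain.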

Note that how often this callback is called is controlled by the PDG node parameters pdg_maxitems and pdg_tickperiod (see onTick below).

onTick(self) → pdg.tickResult

This callback is called periodically when the graph is cooking. The callback is generally used to check the state of running work items. This is also the only safe place to cancel an ongoing cook.

The period of this callback is controlled by the PDG node parameter pdg_tickperiod, and the maximum number of onSchedule callbacks for ready items between ticks is controlled by the node parameter pdg_maxitems. For example, by default the tick period is 0.5s and the max items per tick is 30. This means that onSchedule will be called a maximum of 60 times per second. Adjusting these values can be useful to control the load on the farm scheduler.
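The arithmetic behind that upper bound is simple enough to check directly. The function name below is made up for illustration; only the two parameter values come from the text.

```python
def max_schedule_calls_per_second(tick_period_s, max_items_per_tick):
    """Upper bound on onSchedule calls per second for the given settings."""
    ticks_per_second = 1.0 / tick_period_s
    return ticks_per_second * max_items_per_tick


# Defaults quoted above: 0.5s tick period, 30 items per tick.
default_rate = max_schedule_calls_per_second(0.5, 30)
print(default_rate)  # 60.0

# Doubling the tick period halves the submission rate.
slower_rate = max_schedule_calls_per_second(1.0, 30)
print(slower_rate)  # 30.0
```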

The callback should return SchedulerReady if the scheduler is ready to accept new work items, and should return SchedulerBusy if it’s full at the moment. In case there is a serious problem with the scheduler (for example the connection to the farm is lost), it should return SchedulerCancelCook.
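A tick handler usually combines two duties: polling the jobs it is tracking and reporting its own capacity. The sketch below assumes a hypothetical poll function and job table, and uses a plain enum as a stand-in for the pdg.tickResult values named above.

```python
from enum import Enum


class TickResult(Enum):
    """Stand-in for the pdg.tickResult values named in the text."""
    SchedulerReady = 1
    SchedulerBusy = 2
    SchedulerCancelCook = 3


def tick(running, max_running, farm_connected, poll):
    """Poll running jobs and report scheduler capacity.

    running: dict mapping item name -> job handle (mutated in place).
    poll: hypothetical callable returning None while a job is still
          running, otherwise its exit code.
    Returns (TickResult, list of (item name, exit code) that finished).
    """
    if not farm_connected:
        # Serious problem: ask for the cook to be cancelled.
        return TickResult.SchedulerCancelCook, []
    finished = []
    for name in list(running):
        code = poll(running[name])
        if code is not None:
            finished.append((name, code))
            # Stop holding a reference once the item is finished.
            del running[name]
    if len(running) >= max_running:
        return TickResult.SchedulerBusy, finished
    return TickResult.SchedulerReady, finished


# Here the job handle is simply its exit code, or None if still running.
jobs = {"a": 0, "b": None}
state, done = tick(jobs, max_running=1, farm_connected=True, poll=lambda h: h)
print(state.name, done, sorted(jobs))
```

In a real scheduler, each finished entry would be reported back to PDG as succeeded or failed before the tick returns.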

onStartCook(self, static, cook_set) → bool

This callback is called when a PDG cook starts, after static generation.

static is True when a static cook is being performed instead of a full cook. See onScheduleStatic for details.

cook_set is the set of pdg.Node objects being cooked.

This can be used to initialize any resources or cache any values that apply to the overall cook. Returning False or raising an exception will abort the cook. You should tell PDG what the user’s working directory is by calling:

self.setWorkingDir(local_path, remote_path)

onStopCook(self, cancel) → bool

Called when cooking completes or is canceled. If cancel is True there will likely be jobs still running. In that case the scheduler should cancel them and block until they are actually canceled. This is also the time to tear down any resources that are set up in onStartCook.
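For a scheduler that runs jobs as local subprocesses, cancel-and-block could look roughly like the following. The cancel_running helper and the job table are hypothetical, and a farm-based scheduler would call its farm's cancellation API instead of terminating processes.

```python
import subprocess
import sys


def cancel_running(processes, timeout_s=5.0):
    """Terminate every tracked process and block until all have exited."""
    for proc in processes.values():
        if proc.poll() is None:
            proc.terminate()
    for proc in processes.values():
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Escalate if the job ignores the polite request.
            proc.kill()
            proc.wait()
    # Nothing is running any more, so drop all references.
    processes.clear()
    return True


# A long-running placeholder job standing in for a real work item.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
jobs = {"item_0": proc}
stopped = cancel_running(jobs)
print(stopped, proc.returncode is not None)
```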

onStart(self) → bool

Called by PDG when the scheduler is first created. Can be used to acquire resources that persist between cooks.

onStop(self) → bool

Called by PDG when the scheduler is cleaned up. Can be used to release resources. Note that this method may not be called in some cases when Houdini is shut down.

endSharedServer(self, sharedserver_name) → bool

Called when a shared server should be terminated. For example the Houdini Command Chain will generate endserver work items which will evaluate this callback when the command chain has ended and the associated Houdini server should be closed. Typically the scheduler can use the shutdownServer function in the pdgjob.sharedserver module to issue the shutdown command via XMLRPC. See command servers for additional details on the use of command chains.

getStatusURI(self, work_item) → str

Returns the status URI for the specified work item. This appears in the MMB detail window of a work item. It can be formatted to point to a local file with 'file:///' or a web page with 'http://'.

getLogURI(self, work_item) → str

Returns the log URI for the specified work item. This appears in the MMB detail window of a work item, and is also available with the special @pdg_log attribute. It can be formatted to point to a local file with 'file:///' or a web page with 'http://'.
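Both URI callbacks come down to string formatting. The helpers below are illustrative only: the log directory layout, the ".log" suffix, and the "/jobs/<id>" web path are assumptions, not conventions defined by PDG.

```python
from urllib.parse import quote


def local_log_uri(log_dir, item_name):
    """Build a file:// URI; log_dir is assumed to be an absolute POSIX path."""
    path = "{}/{}.log".format(log_dir.rstrip("/"), item_name)
    # quote() escapes characters such as spaces that are not valid in a URI.
    return "file://" + quote(path)


def web_status_uri(base_url, job_id):
    """Build a status page URL; the /jobs/<id> layout is hypothetical."""
    return "{}/jobs/{}".format(base_url.rstrip("/"), job_id)


log_uri = local_log_uri("/tmp/pdgtemp/logs", "ropfetch1_12")
status_uri = web_status_uri("http://farm.example.com/", 42)
print(log_uri)
print(status_uri)
```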

workItemResultServerAddr(self) → str

Returns the network endpoint for the work item result server, in the format <HOST>:<PORT>. This is equivalent to the __PDG_RESULT_SERVER__ command token and the job environment variable $PDG_RESULT_SERVER. This will typically be an XMLRPC API server.
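On the job side, a script that wants to reach the result server has to split that string into host and port. The helper names and the example address below are made up; only the <HOST>:<PORT> format and the PDG_RESULT_SERVER variable name come from the text.

```python
import os


def parse_result_server(addr):
    """Split '<HOST>:<PORT>' into (host, port)."""
    host, sep, port = addr.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected <HOST>:<PORT>, got {!r}".format(addr))
    return host, int(port)


def result_server_from_env(environ=None):
    """Read the endpoint from the job environment (os.environ by default)."""
    environ = os.environ if environ is None else environ
    return parse_result_server(environ["PDG_RESULT_SERVER"])


# A dict stands in for the job environment; the address is invented.
example = result_server_from_env({"PDG_RESULT_SERVER": "farmhost01:49152"})
print(example)
```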

onScheduleStatic(self, dependency_map, dependent_map, ready_items) → None

Called to do a static cook of the graph, which is a cook mode of StaticDepsFull or StaticDepsNode. Typically this function will build a complete job spec and submit this to the farm scheduler. How this is done depends on your farm scheduler API. For example the dependencies between work items may have to be translated into parent/child relationships in the job spec so that the work is executed in the correct order.

Note

This functionality is only needed if complete static cooks are required. In order to show status changes in the TOP graph, the implementation has to provide a callback server so that jobs can report results and status changes. It also has to ensure that all work items are serialized so that their JSON representation is available to the job scripts when they execute. In addition, not all TOP nodes support this mode of cooking by default, and some may require customization to work with your farm scheduler. For example, ROP Fetch and other ROP-based nodes will poll the callback server when batched if the ROP Fetch cookwhen parameter is not set to All Frames are Ready.

dependency_map is a map of pdg.WorkItem to a set of its dependency work items.

dependent_map is a map of pdg.WorkItem to a set of its dependent work items.

ready_items is a list of pdg.WorkItem that are ready to be executed.

Note that this information can be obtained via pdg.Graph.dependencyGraph:

import hou
import pdg

n = hou.node("/obj/topnet1/out")
# Call executeGraph to ensure the PDG context is created
n.executeGraph(True, True, False, True)
context = n.getPDGGraphContext()
# Perform the generation phase of the PDG cook
context.cook(True, pdg.cookType.StaticDepsFull)
# Retrieve the generated task graph work items and topology
(dependencies, dependents, ready) = context.graph.dependencyGraph(True)
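Given maps of this shape, one way to derive a valid submission order is a breadth-first topological sort (Kahn's algorithm). The sketch below runs outside Houdini and uses plain strings in place of pdg.WorkItem objects. It assumes every item appears as a key in dependency_map and that ready_items are exactly the items without dependencies. How the result maps onto a farm's job spec is left open.

```python
from collections import deque


def submission_order(dependency_map, dependent_map, ready_items):
    """Return items ordered so each one follows all of its dependencies."""
    # Number of unfinished dependencies per item.
    remaining = {item: len(deps) for item, deps in dependency_map.items()}
    queue = deque(ready_items)
    order = []
    while queue:
        item = queue.popleft()
        order.append(item)
        # sorted() only makes the output deterministic for this example.
        for child in sorted(dependent_map.get(item, ())):
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return order


# Strings stand in for work items: "c" needs "a" and "b", "d" needs "c".
dependency_map = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}}
dependent_map = {"a": {"c"}, "b": {"c"}, "c": {"d"}, "d": set()}
order = submission_order(dependency_map, dependent_map, ["a", "b"])
print(order)  # ['a', 'b', 'c', 'd']
```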

Note

This mode of cooking is not exposed in the TOP UI and is not supported by the stock schedulers, although Local Scheduler has a basic implementation for demonstration purposes. To trigger this mode of cooking, you can call pdg.GraphContext.cook with a mode of StaticDepsFull or StaticDepsNode.

The implementation should save the required data and return immediately from this function. Then it should asynchronously manage the execution of the graph and report back all state changes via the scheduler node functions onWorkItemSucceeded, onWorkItemFailed or onWorkItemCanceled. In addition, it should ensure that all data changes done by jobs are reported back to PDG, for example by calling onWorkItemFileResult.

Once all work items have been reported back to PDG as finished the static cook will end.
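The "return immediately, then report asynchronously" pattern can be sketched with a background thread. Everything here is a stand-in: StaticCookSketch, run_job, and report are hypothetical, and in a real scheduler the report step would forward to the scheduler functions named above (onWorkItemSucceeded, onWorkItemFailed, and so on), whose signatures are not shown here. The sketch also ignores dependencies between items.

```python
import threading


class StaticCookSketch:
    """Hypothetical helper showing a non-blocking static submission."""

    def __init__(self, run_job, report):
        self._run_job = run_job  # assumed: returns True when the job succeeds
        self._report = report    # assumed: report(item, state) notifies PDG
        self._thread = None

    def on_schedule_static(self, items):
        # Save what is needed, start the worker, and return right away.
        saved = list(items)
        self._thread = threading.Thread(
            target=self._work, args=(saved,), daemon=True
        )
        self._thread.start()

    def _work(self, items):
        # Runs in the background and reports every state change.
        for item in items:
            ok = self._run_job(item)
            self._report(item, "succeeded" if ok else "failed")

    def wait(self):
        # Only for this example: block until all reports are in.
        self._thread.join()


reports = []
sketch = StaticCookSketch(
    run_job=lambda item: item != "b",
    report=lambda item, state: reports.append((item, state)),
)
sketch.on_schedule_static(["a", "b", "c"])
sketch.wait()
print(reports)
```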
