Houdini 20.5 Executing tasks with PDG/TOPs

Job API

Python API used by job scripts.

Overview

Work items (jobs) that cook out-of-process often need to send data or their status back to PDG. The PDG Scheduler tracks when a job starts and whether it succeeds or fails, independent of the job script. However, if the job produces output files that contain information PDG should know about, you can use this API to report that information back to PDG. Besides reporting output files, you can also use this API to write attributes onto running work items.

The Job API has two main modules:

pdgjson.py provides a high-level API which matches pdg.WorkItem. You can use this module to read and write work item attributes and input/output files. This is the recommended API.
pdgcmd.py provides a lower-level API which is used by pdgjson.py. You can also use this module when necessary.

These Python modules are automatically copied to $PDG_TEMP/scripts by the PDG Scheduler.
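
As a minimal sketch of how a job script could locate those copied modules, the helpers below build the scripts directory from the PDG_TEMP environment variable and put it on the import path. The helper names are hypothetical, and scheduler wrapper scripts may already do this for you, so treat it as an illustration rather than a required step.

```python
import os
import sys


def pdg_scripts_dir(environ=os.environ):
    """Return the directory expected to hold pdgjson.py and pdgcmd.py, or None."""
    pdg_temp = environ.get("PDG_TEMP")
    if not pdg_temp:
        return None
    return os.path.join(pdg_temp, "scripts")


def ensure_importable(environ=os.environ, path=sys.path):
    """Prepend the scripts directory to the import path if it is missing."""
    scripts = pdg_scripts_dir(environ)
    if scripts and scripts not in path:
        path.insert(0, scripts)
    return scripts
```

After calling ensure_importable() inside a job, `import pdgjson` and `import pdgcmd` should resolve to the copies in that directory.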

Basic Usage

Python Module

Behavior

pdgjson

Provides a high-level WorkItem API similar to what is available in PDG. Useful top-level imports are pdg.WorkItem, pdg.File, pdg.attribType, pdg.attribFlag, pdg.workItemType, and pdg.workItemState. You can obtain the WorkItem object by calling pdgjson.WorkItem.fromJobEnvironment().

from pdgjson import WorkItem

# create WorkItem object from the data serialized to `$PDG_TEMP/data/workitem.json`
work_item = WorkItem.fromJobEnvironment()

# read attrib values from the running work item
val = work_item.attribValue('myintattrib')

# Send data back to PDG work item via network calls
work_item.setStringAttrib('runtime_attrib', 'test value', 0)
work_item.addOutputFile('/tmp/myoutput.txt')

pdgcmd

Provides a low-level API to communicate with PDG via RPC function calls.

import pdgcmd

# Send data back to the PDG work item via network calls
pdgcmd.setStringAttrib('runtime_attrib', 'test value', 0)
pdgcmd.addOutputFile('/tmp/myoutput.txt')

pdgcmd Functions

Methods

addOutputFile(result_data, workitem_id=None, server_addr=None, result_data_tag="", subindex=-1, and_success=False, to_stdout=True, duration=0.0, hash_code=0)

If your work script generates a file (or files), then you can report the result(s) using this function. These files will then be added as Output files on the work item.

Note

You can write back attributes in a similar fashion using the other functions listed below.

result_data

An output filename or a list of output file paths.

workitem_id

The id of the work item that generated the result. If you omit this argument, then the function will look it up in the environment.

server_addr

A string containing the IP address of the result server to report to. If you omit this argument, then the function will look it up in the environment.

result_data_tag

The "type tag" string to use for the output file(s). For example, "file/geo".

and_success

When this is True, the function call also sets the work item’s status to success in addition to adding the output file.

duration

If you set and_success=True, then you can set this to the total runtime (float value in seconds) of the work script.

If you don’t set this, then PDG will automatically calculate the duration as the time between the workItemStartCook RPC and the job termination.
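
The and_success and duration arguments can be combined so that one call reports both the file and the result. The sketch below times a piece of work and passes the elapsed seconds along. The function name `run_and_report` and the `work` callable are hypothetical, and `add_output_file` is passed in as a parameter so the example runs outside a PDG job. In a real job you would pass pdgcmd.addOutputFile.

```python
import time


def run_and_report(work, add_output_file, clock=time.monotonic):
    """Run `work`, then report its output file and success in one call.

    `work` is a hypothetical callable that returns the path it wrote.
    `add_output_file` is expected to be pdgcmd.addOutputFile.
    """
    start = clock()
    output_path = work()
    elapsed = clock() - start
    # and_success=True also marks the work item as succeeded;
    # duration is the runtime of the work in seconds.
    add_output_file(
        output_path,
        result_data_tag="file/geo",
        and_success=True,
        duration=elapsed,
    )
    return output_path, elapsed
```

Measuring the duration in the script is only useful when the script's own runtime differs from the process lifetime that PDG would otherwise measure.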

addOutputFiles(output_file_array, workitem_id=None, server_addr=None, output_file_tag="", subindex=-1, to_stdout=True, hash_codes=[])

Provides a version of addOutputFile that writes a list of files in an array as a single operation, which can be more efficient for large output file lists.

output_file_array

An array of output file paths.

workitem_id

The id of the work item that generated the file. If you omit this argument, then the function will look it up in the environment.

server_addr

A string containing the IP address of the result server to report to. If you omit this argument, then the function will look it up in the environment.

output_file_tag

The "type tag" string or string array to use for the output file(s). For example, "file/geo" or ["file/geo/render", "file/geo/collision"].
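
When each file needs its own tag, the tag array has to line up with the path array. The sketch below derives one tag per path from the file extension and reports everything in a single call. The extension table, the "file" fallback tag, and both helper names are illustrative assumptions. Only "file/geo" comes from the description above. `add_output_files` is passed in so the example runs without PDG, and in a job it would be pdgcmd.addOutputFiles.

```python
import os

# Extension-to-tag table. "file/geo" comes from the text above; the
# other entries are illustrative assumptions.
TAG_BY_EXTENSION = {
    ".bgeo": "file/geo",
    ".obj": "file/geo",
    ".exr": "file/image",
}


def tags_for(paths, default="file"):
    """Return one type tag per path, in the same order as `paths`."""
    tags = []
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        tags.append(TAG_BY_EXTENSION.get(ext, default))
    return tags


def report_all(paths, add_output_files):
    """Report every path in one call (expects pdgcmd.addOutputFiles)."""
    paths = list(paths)
    add_output_files(paths, output_file_tag=tags_for(paths))
```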

reportResultData(result_data, workitem_id=None, server_addr=None, result_data_tag="", subindex=-1, and_success=False, to_stdout=True, duration=0.0, hash_code=0)

This method is deprecated. Use addOutputFile instead.

delocalizePath(local_path) → str

When Path Mapping is disabled (set to None through the Scheduler UI), this function delocalizes the specified path to be rooted at PDG_DIR. Requires the presence of the PDG_DIR environment variable.

local_path

The local path to be delocalized.

makeDirSafe(local_path) → str

Makes a directory in the specified path if one does not exist. This function prevents concurrent directory creation.

local_path

The directory path.
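
To illustrate what tolerating concurrent directory creation means, here is a standard-library stand-in. It is not pdgcmd's implementation, the function name is hypothetical, and returning the path is an assumption made for the example.

```python
import errno
import os


def make_dir_tolerant(path):
    """Create `path` and its parents, tolerating another process creating it first."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Losing the race is fine as long as a directory now exists there.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    return path
```

Several jobs that write into the same output folder can each call a helper like this without failing when another job creates the folder first.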

localizePath(deloc_path) → str

Localizes the specified path. When Path Mapping is disabled, this function replaces any __PDG* tokens. When Path Mapping is enabled, this function maps the path to the local file system and expands the environment variables.

deloc_path

The path to be localized.
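
The idea behind delocalizing and localizing with Path Mapping disabled can be sketched as a token swap: a machine-specific root is replaced by a placeholder on the way out and expanded again on the machine that reads the path. This is a conceptual illustration only, not the pdgcmd code. The token spelling `__PDG_DIR__` and both function names are assumptions.

```python
TOKEN = "__PDG_DIR__"  # assumed token spelling, for illustration only


def delocalize(local_path, pdg_dir):
    """Replace a leading `pdg_dir` with the placeholder token."""
    root = pdg_dir.rstrip("/")
    path = local_path.replace("\\", "/")
    if path == root or path.startswith(root + "/"):
        return TOKEN + path[len(root):]
    # Paths outside the root are returned with only slashes normalized.
    return path


def localize(deloc_path, pdg_dir):
    """Replace the placeholder token with this machine's `pdg_dir`."""
    return deloc_path.replace(TOKEN, pdg_dir.rstrip("/"))
```

A path delocalized on one machine and localized on another ends up under the second machine's root, which is what lets the same work item data be used on hosts with different mount points.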

waitUntilReady(workitem_id, subindex, server_addr=None)

Blocks the job until a batch sub-item begins to cook. When a batch work item is set to start cooking "When first frame is ready", it needs to poll PDG to determine when all the upstream dependencies for a given frame are complete so that it’s safe for the frame to start cooking.

workitem_id

The id of the batch work item.

subindex

The batch work item sub-index (the index of the work item within the batch).

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

getWorkItemJSON(workitem_id, subindex, server_addr=None) → str

Returns a string containing the serialized JSON for the specified work item. This is used to get the full work item data for the specified batch sub-item just-in-time so that it can be set as the active work item within Houdini before starting to cook the frame.

workitem_id

The id of the batch work item.

subindex

The batch work item sub-index (the index of the work item within the batch).

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.
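
The two batch functions are typically used together: wait for a sub-item, then fetch its data just before cooking it. A minimal sketch of that loop follows. The function name `cook_batch` and the `cook_frame` callable are hypothetical, and `api` is passed in so the example can run with a stand-in object. In a job it would be the imported pdgcmd module.

```python
import json


def cook_batch(api, workitem_id, batch_size, cook_frame):
    """Cook each sub-item once PDG reports that it is ready.

    `api` is expected to be the pdgcmd module; `cook_frame` is a
    hypothetical callable that does the per-frame work.
    """
    results = []
    for subindex in range(batch_size):
        # Blocks until the sub-item's upstream dependencies are complete.
        api.waitUntilReady(workitem_id, subindex)
        # Fetch the sub-item's serialized data just in time and decode it.
        item_data = json.loads(api.getWorkItemJSON(workitem_id, subindex))
        results.append(cook_frame(subindex, item_data))
    return results
```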

workItemSuccess(workitem_id, subindex=-1, server_addr=None, to_stdout=True)

Reports that the specified work item has succeeded.

Note

Normally work items do not need to use this function because the schedulers will determine failed/success state by looking at the return code of the process.

workitem_id

The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

to_stdout

When this is True, the information is printed to stdout and the RPC is performed.

workItemFailed(workitem_id, subindex=-1, server_addr=None, to_stdout=True)

Reports that the specified work item has failed.

Note

Normally work items do not need to use this function because the schedulers will determine failed/success state by looking at the return code of the process.

workitem_id

The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

to_stdout

When this is True, the information is printed to stdout and the RPC is performed.

workItemStartCook(workitem_id, subindex=-1, server_addr=None, to_stdout=True)

Reports that the specified work item has started cooking.

Note

Normally work items do not need to use this because the Local Scheduler will already know that the item has started cooking, and other farm schedulers will generally have a wrapper script that will make this call before invoking the work item.

workitem_id

The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

to_stdout

When this is True, the information is printed to stdout and the RPC is performed.
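
For a custom farm integration that does not already report state, a wrapper along these lines could bracket the job with the three status calls. It is a sketch under assumptions: `run_wrapped` and `job` are hypothetical names, and `api` stands for the imported pdgcmd module.

```python
def run_wrapped(api, workitem_id, job):
    """Report start, run `job`, then report success or failure.

    `api` is expected to be the pdgcmd module; `job` is a hypothetical
    callable that raises an exception on error.
    """
    api.workItemStartCook(workitem_id)
    try:
        job()
    except Exception:
        api.workItemFailed(workitem_id)
        return False
    api.workItemSuccess(workitem_id)
    return True
```

A real wrapper would probably also exit with a non-zero return code when this returns False, so that the scheduler's own return-code check agrees with the reported state.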

workItemAppendLog(log_data, log_type=3, workitem_id, subindex=-1, server_addr=None, to_stdout=True)

Appends the specified log string to the work item’s internal log buffer.

log_data

Text data to append to the end of the work item’s log.

log_type

Log message type. Defaults to pdg.workItemLogType.Raw.

workitem_id

The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

to_stdout

When this is True, the information is printed to stdout and the RPC is performed.
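
One way to use this function is to route a job's normal Python logging into the work item's log buffer. The handler below is a sketch: the class name is hypothetical, and the append function is passed in so the example runs without PDG. In a job you would pass pdgcmd.workItemAppendLog. The work item id is passed by keyword and log_type is left at its default.

```python
import logging


class PdgLogHandler(logging.Handler):
    """Forward formatted log records to a work item's log buffer."""

    def __init__(self, append_log, workitem_id):
        super().__init__()
        self._append_log = append_log  # expected: pdgcmd.workItemAppendLog
        self._workitem_id = workitem_id

    def emit(self, record):
        text = self.format(record) + "\n"
        # log_type is not passed, so the function's default applies.
        self._append_log(text, workitem_id=self._workitem_id)
```

Each log record triggers one call, so a chatty job may want to buffer several lines before sending them.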

workItemSetCustomState(custom_state, workitem_id, subindex=-1, server_addr=None, to_stdout=True)

Sets the custom state string of the specified work item.

custom_state

The string to set as the work item’s custom state.

workitem_id

The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

to_stdout

When this is True, the information is printed to stdout and the RPC is performed.
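
A custom state string is free-form, so one possible use is a short progress readout. The sketch below formats "done/total (percent)" and sends it. The function name and the format are assumptions, and `set_custom_state` is passed in so the example runs without PDG. In a job it would be pdgcmd.workItemSetCustomState.

```python
def report_progress(set_custom_state, workitem_id, done, total):
    """Publish a short progress string as the work item's custom state."""
    if total <= 0:
        raise ValueError("total must be positive")
    state = "%d/%d (%d%%)" % (done, total, 100 * done // total)
    set_custom_state(state, workitem_id)
    return state
```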

setStringAttribArray(attr_name, attr_value, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

An array of strings.

workitem_id

(Optional) The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setIntAttribArray(attr_name, attr_value, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

An array of numeric values.

workitem_id

(Optional) The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setFloatAttribArray(attr_name, attr_value, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

An array of numeric values.

workitem_id

(Optional) The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setFileAttribArray(attr_name, attr_value, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

An array of pdgjson.File objects.

workitem_id

(Optional) The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setPyObjectAttrib(attr_name, attr_value, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server. The attr_value needs to be a serialized Python Object, for example the string repr(..) of the object that can be stored to a string. If a custom serialization format is used, it needs to match the module defined using the PDG_PYATTRIB_LOADER variable or through the pdg.TypeRegistry.pySerializationModule API call so that the Houdini session can deserialize the attribute data.

attr_name

The name of the attribute to create or update.

attr_value

A string-serialized Python object. By default, it should be a repr() string that can be deserialized using eval(). However, you can use a custom format if you set pdg.TypeRegistry.pySerializationModule.

workitem_id

(Optional) The name of the work item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.
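
Because the default format relies on repr() and eval(), it helps to confirm that a value survives that round trip before sending it. The sketch below performs the check with ast.literal_eval, which only accepts plain literals. This is stricter than eval() and is chosen here for safety in the example. The function name is hypothetical.

```python
import ast


def serialize_for_pdg(value):
    """Serialize `value` with repr() after checking that it round-trips."""
    text = repr(value)
    try:
        # literal_eval only accepts plain literals, which keeps this check safe.
        restored = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("repr() of the value is not a plain literal") from exc
    if restored != value:
        raise ValueError("value does not survive a repr() round trip")
    return text
```

In a job, the returned string would be the attr_value argument, for example pdgcmd.setPyObjectAttrib('settings', serialize_for_pdg(settings)), where 'settings' is an illustrative attribute name.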

setStringAttrib(attr_name, attr_value, attr_index, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

A string value.

attr_index

The index of the value within the attribute’s array of values.

workitem_id

(Optional) The id of the item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setIntAttrib(attr_name, attr_value, attr_index, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

A numeric value.

attr_index

The index of the value within the attribute’s array of values.

workitem_id

(Optional) The id of the item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setFloatAttrib(attr_name, attr_value, attr_index, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

A numeric value.

attr_index

The index of the value within the attribute’s array of values.

workitem_id

(Optional) The id of the item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

setFileAttrib(attr_name, attr_value, attr_index, workitem_id=None, subindex=-1, server_addr=None)

Writes attribute data back into a work item in PDG via the callback server.

attr_name

The name of the attribute to create or update.

attr_value

A pdgjson.File object.

attr_index

The index of the value within the attribute’s array of values.

workitem_id

(Optional) The id of the item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.
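
Since the single-value setters share the same argument order, a small dispatcher can pick the right one from the Python type of the value. This is a sketch: `set_attrib` is a hypothetical helper, file attributes are left out, and `api` stands for the imported pdgcmd module.

```python
def set_attrib(api, name, value, index=0):
    """Call the setter that matches the Python type of `value`.

    `api` is expected to be the pdgcmd module.
    """
    # bool is rejected explicitly because it is a subclass of int.
    if isinstance(value, bool):
        raise TypeError("convert bool to int before writing it")
    if isinstance(value, str):
        api.setStringAttrib(name, value, index)
    elif isinstance(value, int):
        api.setIntAttrib(name, value, index)
    elif isinstance(value, float):
        api.setFloatAttrib(name, value, index)
    else:
        raise TypeError("unsupported attribute type: %s" % type(value).__name__)
```

Rejecting bool is a design choice for the example: it keeps True from silently being written as the integer 1.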

invalidateCache(workitem_id=None, subindex=-1, server_addr=None)

Requests that the cache of the work item be invalidated by PDG. This forces downstream tasks to cook. The same effect can be achieved by adding an output file to the work item; however, this method can be used to invalidate caches without explicitly adding a file.

workitem_id

(Optional) The id of the item.

subindex

The batch work item sub-index. -1 indicates that the work item is not a batch sub-item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

warning(message, workitem_id=None, server_addr=None)

Attaches a warning message to the specified work item’s node.

message

The string message to display.

workitem_id

(Optional) The name of the work item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.

reportServerStarted(servername, pid, host, port, proto_type, log_fname, workitem_id=None, server_addr=None)

This method is deprecated – shared servers have been replaced by Service Block.

Reports that a shared server has been started. This is used by work items in a Block Begin node that has been configured to use shared servers instead of services.

servername

The name of the server.

pid

The numeric process ID (PID) of the server process.

host

The hostname of the machine the server is running on.

port

The numeric port number at which the server is listening.

proto_type

The protocol associated with this server.

log_fname

The path to the log file that this server is using.

workitem_id

(Optional) The id of the item.

server_addr

(Optional) The PDG Result Server address in the form 'host:port'.
