Machine learning often consists of two distinct stages. Training is the process of building a model that can solve a problem. Inference is using the model to actually solve the problem.
This node performs inference. It uses a pre-trained model on the inputs to generate the machine-learned solution.
Many systems exist for training, and these often require significant dependencies. However, the resulting model often consists of only a few simple operations. ONNX provides a file format and inference engine that allow models trained in PyTorch or TensorFlow to be used in a common framework.
An ONNX inference model consists of a number of inputs and outputs. Each input and output is a tensor, otherwise known as multi-dimensional data. For dimension 1 this is an array, dimension 2 an image, dimension 3 a volume, and higher dimensions can be generalized from there.
Note
ONNX models can contain inputs and outputs of types other than tensor, but currently this node only supports tensors.
Parameters ¶
Model ¶
Model File
The name of a .onnx file to load. The inputs and outputs must match the settings of the ONNX file.
Reload Model
Force a reload of the .onnx file.
Setup Shapes from Model
This callback reads the .onnx file to determine the number of inputs and outputs and their shapes.
This populates the input and output configuration parameters, such as Input Tensor Shape, and adds the required inputs and outputs to the node.
Dynamic axes are assigned a shape size of -1.
Number of Inputs ¶
Name
The semantic name of the input. Actual wiring to the ONNX model is done by index, not by name, but this can be used to label them. The Setup Shapes from Model button will populate this with the input name from the model.
Data
Selects the node input(s) which provide the data for this model input. This can be a space-separated list of node input names to link multiple inputs together in a chain, such as separate R, G, and B layers.
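As an illustration of the space-separated Data pattern, the sketch below splits a pattern into names and chains the matching node inputs one after another. The function `gather_model_input`, the dictionary of flat lists, and the input names `red`, `green`, and `blue` are hypothetical; how the node actually orders the combined data depends on its Tensor Order and Collate Channels Separately options, which this sketch does not model.

```python
def gather_model_input(data_pattern, node_inputs):
    """Chain the node inputs named in a space-separated pattern.

    `node_inputs` maps a node input name to a flat list of values.
    Both names and the chaining order are assumptions of this sketch.
    """
    names = data_pattern.split()
    missing = [name for name in names if name not in node_inputs]
    if missing:
        raise KeyError(f"unknown node inputs: {missing}")

    chained = []
    for name in names:
        # Append each input's values after the previous input's values.
        chained.extend(node_inputs[name])
    return chained


node_inputs = {"red": [1, 2], "green": [3, 4], "blue": [5, 6]}
combined = gather_model_input("red green blue", node_inputs)
```

Here `combined` holds the red values, then the green values, then the blue values, which matches the idea of linking separate R, G, and B layers into one model input.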
Input Tensor Shape
The shape of the input. -1 is used for a dynamic axis. The first 0 axis terminates the shape. The total shape must match the number of elements in the input.
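The shape rules above can be sketched in a few lines. This is one reading of the parameter description, not the node's actual implementation: the function name `resolve_shape` is hypothetical, and the restriction to a single dynamic axis is an assumption made to keep the example unambiguous.

```python
from math import prod


def resolve_shape(shape, element_count):
    """Resolve a shape parameter against a flat element count.

    Rules taken from the parameter description: the first 0 ends the
    shape, -1 marks a dynamic axis, and the resolved shape must account
    for every element.
    """
    # Keep only the axes that come before the first 0.
    dims = []
    for axis in shape:
        if axis == 0:
            break
        dims.append(axis)

    dynamic = [i for i, axis in enumerate(dims) if axis == -1]
    if len(dynamic) > 1:
        # Assumption: this sketch handles at most one dynamic axis.
        raise ValueError("this sketch resolves at most one dynamic axis")

    known = prod(axis for axis in dims if axis != -1)
    if dynamic:
        if element_count % known != 0:
            raise ValueError("element count is not divisible by the fixed axes")
        dims[dynamic[0]] = element_count // known
    elif known != element_count:
        raise ValueError("shape does not match the number of elements")
    return dims


batch_shape = resolve_shape([-1, 3, 224, 224, 0, 0], 2 * 3 * 224 * 224)
fixed_shape = resolve_shape([3, 4, 0], 12)
```

In the first call the dynamic axis expands to cover the two images' worth of data; in the second, the trailing 0 simply ends a fully fixed shape.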
Tensor Order
Specifies how the image is converted to a 1D tensor.
YX
Unpacks pixels horizontally (neighboring samples along X are placed together).
XY
Unpacks pixels vertically (neighboring samples along Y are placed together).
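The difference between the two orders is easiest to see on a tiny image. The sketch below assumes an image stored as a list of rows (`image[y][x]`) and a hypothetical helper `flatten_image`; it reflects the descriptions above rather than the node's internals.

```python
def flatten_image(image, order="YX"):
    """Flatten a list-of-rows image (image[y][x]) into a 1D list."""
    height = len(image)
    width = len(image[0]) if height else 0
    if order == "YX":
        # X varies fastest: neighbors along X end up next to each other.
        return [image[y][x] for y in range(height) for x in range(width)]
    if order == "XY":
        # Y varies fastest: neighbors along Y end up next to each other.
        return [image[y][x] for x in range(width) for y in range(height)]
    raise ValueError(f"unknown order: {order!r}")


image = [
    [1, 2, 3],  # row y = 0
    [4, 5, 6],  # row y = 1
]
yx = flatten_image(image, "YX")
xy = flatten_image(image, "XY")
```

With YX the rows stay intact (1, 2, 3 followed by 4, 5, 6), while with XY each column is emitted in turn (1, 4, then 2, 5, then 3, 6).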
Collate Channels Separately
When converting an image with multiple channels per pixel into a 1D tensor, the channel value can either be interleaved or placed one after another.
With this option off, an RGB image would be stored as rgbrgbrgb and the shape’s final dimension should be the number of channels.
With this option on, an RGB image would be stored as rrrgggbbb and the shape’s first dimension should be the number of channels.
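The two layouts can be sketched as follows. The helper `pack_pixels` and the tuple-per-pixel representation are hypothetical and only illustrate the interleaved and separated orderings described above.

```python
def pack_pixels(pixels, collate_channels_separately=False):
    """Pack a list of per-pixel channel tuples into a flat list."""
    if not pixels:
        return []
    if collate_channels_separately:
        # Separated layout: all of channel 0, then all of channel 1, ...
        channel_count = len(pixels[0])
        return [pixel[c] for c in range(channel_count) for pixel in pixels]
    # Interleaved layout: every channel of pixel 0, then pixel 1, ...
    return [value for pixel in pixels for value in pixel]


pixels = [("r0", "g0", "b0"), ("r1", "g1", "b1"), ("r2", "g2", "b2")]
interleaved = pack_pixels(pixels, collate_channels_separately=False)
separated = pack_pixels(pixels, collate_channels_separately=True)
```

For three pixels, the interleaved data corresponds to a shape ending in the channel count (3 pixels by 3 channels), while the separated data corresponds to a shape starting with it (3 channels by 3 pixels).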
Channel Size
Overrides the number of channels for the image.
Number of Outputs ¶
Name
The semantic name of the output. Actual wiring to the ONNX model is done by index, not by name, but this can be used to label them.
This name is matched by the output Data parameter to determine which model output is used for the node’s output.
Output Tensor Shape
The shape of the output. -1 is used for a dynamic axis. The first 0 axis terminates the shape. The total shape must match the number of elements generated by the model.
Tensor Order
Specifies how the image is converted to a 1D tensor.
YX
Unpacks pixels horizontally (neighboring samples along X are placed together).
XY
Unpacks pixels vertically (neighboring samples along Y are placed together).
Collate Channels Separately
When converting an image with multiple channels per pixel into a 1D tensor, the channel value can either be interleaved or placed one after another.
With this option off, an RGB image would be stored as rgbrgbrgb and the shape’s final dimension should be the number of channels.
With this option on, an RGB image would be stored as rrrgggbbb and the shape’s first dimension should be the number of channels.
Channel Size
Overrides the number of channels for the image.
Max Batch Size
Usually models are configured to have the first dimension be dynamic and control batches. This is often done to improve training, but in many cases, such as point-based models, it is also very useful for inference.
Dynamic axes are usually expanded to the size of the input data and run in a single pass. This can require a lot of memory, however, so this option limits the number of elements that can be processed at once. When the input exceeds the limit, the model is re-run on successive batches.
Note
Only the first dimension eligible for batching is handled in this manner.
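The batching behavior can be sketched as a loop over slices of the first dimension. Everything named here is hypothetical: `run_in_batches` stands in for the node's internal loop, `double_points` stands in for a model, and treating a non-positive limit as "no limit" is an assumption of this sketch.

```python
def run_in_batches(run_model, rows, max_batch_size):
    """Run `run_model` over `rows` in slices of at most `max_batch_size`."""
    if max_batch_size <= 0:
        # Assumption: a non-positive limit means a single unbatched pass.
        return run_model(rows)

    results = []
    for start in range(0, len(rows), max_batch_size):
        batch = rows[start:start + max_batch_size]
        # Each batch is a separate model run; results are joined in order.
        results.extend(run_model(batch))
    return results


batch_sizes = []


def double_points(batch):
    """Stand-in model: records the batch size and doubles each value."""
    batch_sizes.append(len(batch))
    return [2 * value for value in batch]


outputs = run_in_batches(double_points, list(range(10)), max_batch_size=4)
```

Ten elements with a limit of four are processed in three runs (four, four, and two elements), and the joined result is the same as a single pass would produce.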
Enable CUDA
Enables performing inference on the GPU (requires CUDA and cuDNN).
Input & Output ¶
Number of Inputs ¶
Specifies the number of inputs for the node.
Name
The name of the node input. This is used by the model input’s Data parameter to select node inputs.
Type
The expected data type for the input.
Tensor Order
Specifies how the image is converted to a 1D tensor. This option only takes effect when multiple inputs are being combined together using the Data string pattern.
YX
Unpacks pixels horizontally (neighboring samples along X are placed together).
XY
Unpacks pixels vertically (neighboring samples along Y are placed together).
Flip Input Vertically
Flips the input image vertically. This can be used for models which expect the origin to be at the top left corner of the image, rather than at the bottom left corner.
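A vertical flip only reverses the order of the rows. The sketch below assumes a list-of-rows image whose first stored row is the bottom of the picture; the helper name `flip_vertically` is hypothetical.

```python
def flip_vertically(image):
    """Return a copy of a list-of-rows image with the row order reversed."""
    return [list(row) for row in reversed(image)]


bottom_up = [
    ["b0", "b1"],  # first stored row (bottom of the picture in this sketch)
    ["t0", "t1"],  # last stored row (top of the picture in this sketch)
]
top_down = flip_vertically(bottom_up)
```

After the flip, the top row is stored first, which is the layout a model with a top-left origin would expect.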
Collate Channels Separately
When converting an image with multiple channels per pixel into a 1D tensor, the channel value can either be interleaved or placed one after another.
With this option off, an RGB image would be stored as rgbrgbrgb and the shape’s final dimension should be the number of channels.
With this option on, an RGB image would be stored as rrrgggbbb and the shape’s first dimension should be the number of channels.
This option only takes effect when multiple inputs are being combined together using the Data string pattern.
Resample Size
Resamples the input image to the specified dimensions. This can be used to match the model’s expected size for an input image. A value of -1 indicates that the dimension will be deduced by maintaining the input aspect ratio.
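The aspect-ratio rule for -1 can be sketched as below. The function `resolve_resample_size` is hypothetical, and both the rounding to the nearest pixel and the handling of two -1 values (leave the size unchanged) are assumptions; the node may behave differently in those cases.

```python
def resolve_resample_size(input_width, input_height, target_width, target_height):
    """Fill in a -1 target dimension so the input aspect ratio is kept."""
    if target_width == -1 and target_height == -1:
        # Assumption: with nothing specified, keep the input size.
        return input_width, input_height
    if target_width == -1:
        # Derive the width from the requested height.
        target_width = max(1, round(input_width * target_height / input_height))
    elif target_height == -1:
        # Derive the height from the requested width.
        target_height = max(1, round(input_height * target_width / input_width))
    return target_width, target_height


from_height = resolve_resample_size(1920, 1080, -1, 540)
from_width = resolve_resample_size(1920, 1080, 512, -1)
explicit = resolve_resample_size(1920, 1080, 224, 224)
```

When both dimensions are given explicitly, as in the last call, the aspect ratio is not preserved and the image is resampled to exactly that size.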
Filter
Specifies the filter to use when resampling the input image.
Brightness Multiplier
Adjusts the brightness of the input image. This can be useful for models which expect colors to be represented by integers between 0 and 255, rather than floating-point values between 0 and 1. Such models require scaling up the input image’s brightness, and then scaling down the output image’s brightness.
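The scale-up and scale-down round trip can be sketched as follows. The helper `scale` and the pass-through stand-in for the model are hypothetical; the multipliers 255 and 1/255 follow from the 0 to 255 range mentioned above.

```python
def scale(values, multiplier):
    """Multiply every value by a brightness multiplier."""
    return [value * multiplier for value in values]


normalized = [0.0, 0.5, 1.0]

# Input side: a multiplier of 255 maps the 0..1 range onto 0..255.
model_input = scale(normalized, 255.0)

# Stand-in for a model that works in the 0..255 range and passes data through.
model_output = list(model_input)

# Output side: a multiplier of 1/255 maps the result back to 0..1.
restored = scale(model_output, 1.0 / 255.0)
```

The restored values match the originals up to floating-point rounding.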
Deduce Output Shapes from Data String
Populates the Output Tensor Shape for each output based on its Data pattern.
Number of Outputs ¶
Specifies the number of outputs for the node.
Name
The name of the node output.
Type
The expected data type for the node output. When set to Infer, the data type is determined by the Channel Size.
Data
Selects the model output(s) which should be translated into this node output. This can be a space-separated list of model output names to combine multiple outputs together.
Auto Deduce Output Shape
Controls whether Deduce Output Shapes from Data String will affect this output.
Output Tensor Shape
The shape of the output. -1 is used for a dynamic axis. The first 0 axis terminates the shape. The total shape must match the number of elements generated by the model.
Tensor Order
Specifies how the image is converted to a 1D tensor.
YX
Unpacks pixels horizontally (neighboring samples along X are placed together).
XY
Unpacks pixels vertically (neighboring samples along Y are placed together).
Flip Output Vertically
Flips the output image vertically. This can be used for models which place the origin at the top left corner of the image, rather than at the bottom left corner.
Collate Channels Separately
When converting an image with multiple channels per pixel into a 1D tensor, the channel value can either be interleaved or placed one after another.
With this option off, an RGB image would be stored as rgbrgbrgb and the shape’s final dimension should be the number of channels.
With this option on, an RGB image would be stored as rrrgggbbb and the shape’s first dimension should be the number of channels.
Channel Size
Overrides the number of channels for the image.
Brightness Multiplier
Adjusts the brightness of the output image. This can be used to invert any brightness scaling that was applied to the input image.