Since 21.0
Overview ¶
This is a generic ML training node for creating models that generate output image from input images. For example, you can train a model to convert terrain topology sketches into height fields, or create a model that colorizes images, given existing samples of black/white images and their colored counterparts.
The node can train two different kinds of models – a paired model where there’s a pre-defined relationship between the input and output images, and an unpaired model where the inputs and outputs are unrelated. In both cases the node trains a Generative Adversarial Network (GAN) that attempts to learn the relationship between the images.
A GAN consists of two neural networks: the generator and the discriminator. The generator learns how to produce output images from input images. The discriminator learns to distinguish a real output from a fake one created by the generator. As the model trains, the generator is optimized to do a better job of tricking the discriminator into accepting a generated output image as real. Conversely, the discriminator is optimized to accurately determine that a generated output is fake. Over many training iterations, both the generator and discriminator improve at their respective tasks. After the training process is finished, the final version of the generator is the trained model.
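The adversarial loop can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch illustration of one training iteration, not the node's actual internals; the generator G, discriminator D, and their optimizers are assumed to already exist:

```python
import torch
import torch.nn.functional as F

def gan_training_step(G, D, opt_G, opt_D, input_img, real_output):
    """Hypothetical sketch of one adversarial training iteration."""
    # Train the discriminator: push real outputs toward 1, generated toward 0.
    fake_output = G(input_img)
    d_real = D(real_output)
    d_fake = D(fake_output.detach())  # detach so only D is updated here
    loss_D = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Train the generator: reward it for making D label its fakes as real.
    d_fake = D(fake_output)
    loss_G = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
```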
The generator network created by this node is a convolutional neural network called a U-Net. The parameters of the U-Net, such as the number of layers and the convolution kernel size, can be configured using parameters on the node. The discriminator network is a single chain of convolution layers, which can also be configured using node parameters.
The node can be configured to write out an ONNX model file during the training process, which can be used with either the ONNX Inference SOP or the ONNX Inference COP to apply the model to data in Houdini.
During the training process, it can be useful to periodically test the model to assess if the training process is working properly. This can be done by providing a second, smaller, test data set that consists of images that aren’t included in the original training data set. When testing the model, the node records the error between the correct test output and the generated output produced by running the model on the test inputs.
Paired Training ¶
Paired training involves providing a set of sample input images and their corresponding outputs. The samples give the training process known, correct image pairs to learn from. The training samples can be specified as two separate directories of images, or as a directory of composite images with the input and output side-by-side in the same image. When specifying the input/output training images as two separate directories, the node expects the names and order of the files in both directories to be the same (see the sketch below the example images).
An example of a pair of training sample images, used to train a model that converts terrain outlines into height fields:
[Image: a pair of training samples, a terrain outline input and its height field output.]
[Image: an alternative composite-image training sample for the terrain generation model, where the entire training sample is saved as a single image. The left half of the image is the input, the right half is the output.]
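With the two-directory convention, the pairing behaves as if samples are matched by sorted filename. This is a minimal illustrative sketch, not the node's actual loader:

```python
from pathlib import Path

def paired_samples(input_dir, reference_dir):
    """Pair training images by filename. Assumes both directories contain
    the same file names in the same order, as the node expects."""
    inputs = sorted(Path(input_dir).iterdir())
    references = sorted(Path(reference_dir).iterdir())
    assert [p.name for p in inputs] == [p.name for p in references], \
        "input and reference directories must contain matching file names"
    return list(zip(inputs, references))
```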
Unpaired Training ¶
Unpaired training follows the same general principles as paired training, but it does not require a known relationship between input and output images. Instead of training a single generator, the node trains a network that maps input → output and a second inverse network that maps output → input. Unpaired training has fewer constraints and relies on the model's ability to find transferable structure between the two data sets, which means the quality and consistency of the trained output is typically lower than with paired training.
Unpaired training can only be used with input images that are specified in two separate directories.
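Keeping the two inverse networks consistent is typically done with a cycle-style reconstruction loss, in the spirit of CycleGAN. This is a hypothetical sketch of that idea; the generator names G_AB and G_BA and the lambda_cycle weight are illustrative, not the node's internals:

```python
import torch.nn.functional as F

def cycle_consistency_loss(G_AB, G_BA, real_A, real_B, lambda_cycle=10.0):
    """Map A -> B -> A (and B -> A -> B) and penalize the difference from
    the original, so the inverse generators stay consistent without any
    paired references."""
    recon_A = G_BA(G_AB(real_A))  # A -> fake B -> reconstructed A
    recon_B = G_AB(G_BA(real_B))  # B -> fake A -> reconstructed B
    return lambda_cycle * (F.l1_loss(recon_A, real_A) + F.l1_loss(recon_B, real_B))
```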
Path Variables ¶
The node supports writing various files to disk during the training process, such as model checkpoints or ONNX files for inference in SOPs/COPs. The file paths are specified using parameters, which can include variables that are filled in just-in-time when the file path is used:
network
The name of the network component, such as generator or discriminator.
index
The network component number. For paired training the model has only one generator and discriminator component, so the index is always 0. For unpaired training there are two distinct generator and discriminator components, so the index is either 0 or 1.
iteration
The training iteration number as an integer.
For example, you can set the ONNX Path parameter to $HIP/ml/models/model.{iteration}.onnx to create an ONNX file with the iteration number in its file name.
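The substitution behaves like Python format-string expansion. A hypothetical illustration (the actual expansion is performed by the node):

```python
# Illustrative only: the node fills these variables in just-in-time.
path_template = "$HIP/ml/models/{network}.{index}.{iteration}.onnx"
path = path_template.format(network="generator", index=0, iteration=250)
# -> "$HIP/ml/models/generator.0.250.onnx"  ($HIP is later expanded by Houdini)
```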
Parameters ¶
Data Set ¶
These parameters define the data set used to train the model. The data set should consist of one or more pairs of images that define sample inputs and outputs for the model. For the Multiple Unpaired Images input type, the images are effectively inputs without any sample outputs.
Input Type
Determines how input training images should be specified.
Single Paired Image
Trains the model from a single input/reference training sample.
Multiple Paired Images
Trains the model from multiple input/reference training samples, specified in two directories.
Multiple Unpaired Images
Trains the model using uncorrelated images without a clearly defined relationship. This is usually slower to train and produces less stable results. It also uses a different neural network architecture than paired training. The input images are specified in two directories.
Composite Paired Images
Trains the model from multiple input/reference training samples expressed as composite image pairs in a single directory.
Input Image
When Input Type is set to Single Paired Image, this parameter determines the path to the sample input image used for training.
Reference Image
When Input Type is set to Single Paired Image, this parameter determines the path to the sample reference or output image used for training.
Input Image Dir
When Input Type is set to Multiple Paired Images or Multiple Unpaired Images, this parameter determines the path to the directory of sample input images used for training.
Reference Image Dir
When Input Type is set to Multiple Paired Images or Multiple Unpaired Images, this parameter determines the path to the directory of sample reference or output images used for training.
Composite Image Dir
When Input Type is set to Composite Paired Images, this parameter determines the path to the directory of composite images that define an input/reference training pair.
Channel Count
The number of color channels in the input and reference images. This also defines the number of channels in results produced by the model. If any input or reference images have too few channels, their last channel is copied to fill the missing channels. Likewise, any channels beyond the channel count are discarded.
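A minimal sketch of that fill/discard behavior on a (C, H, W) image tensor, assuming PyTorch; the helper is illustrative, not the node's code:

```python
import torch

def match_channel_count(image, channel_count):
    """Repeat the last channel to fill missing channels; discard any
    channels beyond the requested count."""
    c = image.shape[0]  # image is a (C, H, W) tensor
    if c < channel_count:
        pad = image[-1:].repeat(channel_count - c, 1, 1)
        image = torch.cat([image, pad], dim=0)
    return image[:channel_count]
```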
Image Size
The width and height of the input and reference images. Also the size of the images produced by the model. Currently the model can only be trained on square images.
Input Data Set Storage
Specifies how the input data set should be stored in memory.
Automatic
PyTorch will manage the storage and transfer of the data set between the CPU and the device used for training.
Keep on Training Device
Makes sure that the training data set is kept on the device selected for training.
Worker Threads
Specifies the number of background worker threads to use when loading input images. Using more threads can improve performance for large data sets. A value of 0 indicates the images should be loaded on the main training thread instead of in the background.
Input Batch Size
The number of training samples to load for each training batch. The model weights are updated after every batch, so training with a larger batch size can speed up training, but also requires more memory.
These parameters can further randomize the input images, adding more variations that may help produce a higher quality model. For paired images, the same transformations are applied to both the input and reference images in a training sample.
Limit Input Count
When on, artificially limits the number of training
samples the model uses. For example, if the input image directory has 1,000
images and this parameter is set to 500
, the training process will
only use the first 500 images.
Crop Inputs
When on, training samples are randomly cropped and resized back to the Image Size before being used in the training process. The value determines how much cropping should occur. For example, a value of 0.5 means that samples will be cropped to a random size between their original dimensions and half their original dimensions.
Horizontal Flip
When on, training samples are randomly flipped on their horizontal axis. The value is the probability that a given training sample is flipped.
Vertical Flip
When on, training samples are randomly flipped on their vertical axis. The value is the probability that a given training sample is flipped.
Rotation
When on, training samples are randomly rotated in both the clockwise and counterclockwise directions. Images are re-scaled so they fill the target Image Size after being rotated. This parameter specifies the probability of a random rotation occurring.
Shuffle Input Order
When on, training samples are randomly shuffled each training iteration.
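The note above about applying identical transformations to both images in a paired sample can be illustrated with torchvision's functional transforms, where the random parameters are drawn once and reused. A hypothetical sketch, with argument names mirroring the options above:

```python
import random
import torchvision.transforms.functional as TF

def augment_pair(input_img, ref_img, flip_h=0.5, flip_v=0.5, crop=0.5, size=256):
    """Apply the same random flips and crop to both tensors in a pair."""
    if random.random() < flip_h:
        input_img, ref_img = TF.hflip(input_img), TF.hflip(ref_img)
    if random.random() < flip_v:
        input_img, ref_img = TF.vflip(input_img), TF.vflip(ref_img)
    # Crop to a random size between the original dimensions and `crop` times
    # the original dimensions, then resize back to the training image size.
    scale = random.uniform(crop, 1.0)
    h, w = input_img.shape[-2:]
    ch, cw = int(h * scale), int(w * scale)
    top, left = random.randint(0, h - ch), random.randint(0, w - cw)
    input_img = TF.resized_crop(input_img, top, left, ch, cw, [size, size])
    ref_img = TF.resized_crop(ref_img, top, left, ch, cw, [size, size])
    return input_img, ref_img
```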
Training ¶
These parameters control the basic behavior of model training, such as the rate the model learns and the total number of iterations spent training.
Maximum Iterations
The maximum number of training iterations or epochs.
Learning Rate
The rate the model updates itself during the training process. A higher value can cause the model to train faster, but may also result in instability that causes the model to fail to converge.
Rate Scheduler
Determines the scheduler type, which varies the learning rate over the training process.
Cosine Annealing
Uses a cosine annealing schedule, as described in the PyTorch documentation.
Linear
Maintains a constant learning rate as defined in the Learning Rate parameter, and then linearly decays it towards a value of 0 near the end of the training process. The Linear Decay parameter determines the number of training iterations spent reducing the learning rate to 0.
Step
Maintains a constant learning rate that is multiplied by a factor of Gamma after every Step Size training iterations.
Linear Decay
When Rate Scheduler is set to Linear, determines how many iterations should be spent decaying the learning rate toward 0. For example, if set to 50, the learning rate will decay towards zero for the last 50 training iterations.
Step Size
When Rate Scheduler is set to Step, determines the number of training iterations between updates to the learning rate.
Gamma
When Rate Scheduler is set to Step, determines the scale factor applied to the learning rate each time a scheduler step is reached.
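The three scheduler choices map onto standard PyTorch classes. A hypothetical sketch under that assumption (argument names are illustrative):

```python
import torch

def make_scheduler(optimizer, kind, max_iterations,
                   linear_decay=50, step_size=100, gamma=0.5):
    """Build a learning rate scheduler matching the options above."""
    if kind == "cosine":
        return torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=max_iterations)
    if kind == "linear":
        # Constant rate, then linear decay to 0 over the final iterations.
        start = max_iterations - linear_decay
        def lr_lambda(it):
            if it < start:
                return 1.0
            return max(0.0, (max_iterations - it) / max(1, linear_decay))
        return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    if kind == "step":
        return torch.optim.lr_scheduler.StepLR(
            optimizer, step_size=step_size, gamma=gamma)
    raise ValueError(kind)
```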
Initializer Type
Determines the type of weight initializer used when setting initial model weights.
Normal
Initializes the model weights with normally distributed random values.
Orthogonal
Initializes the model weights to an orthogonal matrix.
Gain
The scale factor applied to randomly initialized model weights.
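The two initializer types correspond to standard PyTorch initializers. A minimal sketch, assuming the gain doubles as the standard deviation in the normal case (an assumption, not documented behavior):

```python
import torch.nn as nn

def init_weights(module, init_type="normal", gain=0.02):
    """Initialize convolution weights with the chosen distribution."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        if init_type == "normal":
            nn.init.normal_(module.weight, mean=0.0, std=gain)
        elif init_type == "orthogonal":
            nn.init.orthogonal_(module.weight, gain=gain)

# Usage: apply recursively to every submodule of a network.
# generator.apply(lambda m: init_weights(m, "orthogonal", gain=0.02))
```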
These parameters determine which optimizer the model should use to update itself each training iteration.
Algorithm
Determines which optimization algorithm to use during the training process.
Adam
Uses the Adam algorithm; see the PyTorch documentation for more details.
Adadelta
Uses the ADADELTA algorithm; see the PyTorch documentation for more details.
SGD
Uses the Stochastic Gradient Descent algorithm; see the PyTorch documentation for more details.
ASGD
Uses the Averaged Stochastic Gradient Descent algorithm; see the PyTorch documentation for more details.
Beta
When Algorithm is set to Adam, determines the beta values used by the optimization algorithm. Higher beta values can result in faster model convergence, but may also decrease stability.
Rho
When Algorithm is set to Adadelta, determines the rho value used by the optimization algorithm. A higher value results in a faster rate of training, but may also decrease stability.
Momentum
When Algorithm is set to SGD, determines the momentum value. This value is optional, but helps the training process converge faster by allowing past gradient calculations to influence the optimization. The momentum value determines how heavily those past results influence the optimization process.
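The four algorithms correspond to standard PyTorch optimizers. A hypothetical construction sketch (default values here are illustrative):

```python
import torch

def make_optimizer(params, algorithm, lr,
                   beta=(0.5, 0.999), rho=0.9, momentum=0.9):
    """Build the optimizer matching the Algorithm parameter."""
    if algorithm == "adam":
        return torch.optim.Adam(params, lr=lr, betas=beta)
    if algorithm == "adadelta":
        return torch.optim.Adadelta(params, lr=lr, rho=rho)
    if algorithm == "sgd":
        return torch.optim.SGD(params, lr=lr, momentum=momentum)
    if algorithm == "asgd":
        return torch.optim.ASGD(params, lr=lr)
    raise ValueError(algorithm)
```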
These parameters define the topology of the Generator network in the underlying GAN model used by this node. The Generator network is a U-Net, which down-samples and then up-samples input images, with skip connections at each layer.
Layers
The number of down-sample and up-sample layers in the generator. This parameter is directly influenced by the Image Size parameter. For example, if the image size is set to 256 and this parameter is set to 8, the generator will have 8 layers with the sizes 64, 128, 256, 256, 256, 256, 256, 256.
Initial Layer Size
The size of the first generator layer. The size of subsequent layers is determined based on this value, as described in the Layers parameter.
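The example under Layers is consistent with layer sizes doubling from the initial size and plateauing at a cap. This is a hypothetical reconstruction; the node's actual sizing rule and cap are not spelled out here:

```python
def generator_layer_sizes(initial_size, layers, cap=256):
    """With initial_size=64 and layers=8 this yields
    [64, 128, 256, 256, 256, 256, 256, 256], matching the example above.
    The cap value is an assumption."""
    return [min(initial_size * 2 ** i, cap) for i in range(layers)]
```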
Kernel Size
The size of the convolution kernels used in the generator.
L1 Lambda
The scale factor applied to the error calculation when training the generator network.
For paired images, the error value is a measure of the difference between a reference image and an image generated by the model using the same training input.
For unpaired images, the error value is a measure of the difference between a reference image and an image generated by applying the output of one generator back into the second generator.
Down Activation
The activation function to use for down-sampling layers.
Up Activation
The activation function to use for up-sampling layers.
These parameters define the topology of the Discriminator network in the underlying GAN model.
Layers
The number of layers in the discriminator.
Initial Layer Size
The size of the first discriminator layer. The size of subsequent layers is determined based on this value, as described in the Layers parameter.
Kernel Size
The size of the convolution kernels used in the discriminator.
Activation
The activation function to use for discriminator layers.
Test Model
When on, the trained model is tested periodically against a separate data set.
Test Target SSIM
When Test Model is on, this parameter can stop training early when a target SSIM score is reached. An average SSIM score for the test results is computed by comparing them to the expected outputs. Once the average crosses the value specified in this parameter, training stops.
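The early-stop check can be sketched with scikit-image's SSIM implementation (an illustrative stand-in; the node's metric may differ in detail):

```python
from skimage.metrics import structural_similarity  # scikit-image >= 0.19

def average_test_ssim(pairs):
    """Average SSIM between generated test outputs and their references.
    `pairs` is a list of (generated, reference) arrays with values in [0, 1]."""
    scores = [structural_similarity(gen, ref, data_range=1.0, channel_axis=-1)
              for gen, ref in pairs]
    return sum(scores) / len(scores)

# Training could stop once average_test_ssim(...) >= the Test Target SSIM value.
```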
These parameters determine the data set used to test the model. This data set should consist of images that do NOT already appear in the training data set.
Test Output Directory
Determines the directory test results are written to. Test outputs include both the original input images and the images generated by the model.
Test Input Type
Determines how test data set images should be specified.
Single Paired Image
Tests the model from a single input/reference sample.
Multiple Paired Images
Tests the model from multiple input/reference samples, specified in two directories.
Multiple Unpaired Images
Tests the model using uncorrelated images without a clearly defined relationship. The test images are specified in two directories.
Composite Paired Images
Tests the model from multiple input/reference samples expressed as composite image pairs in a single directory.
Input Image
When Test Input Type is set to Single Paired Image, determines the path to the sample input image used for testing.
Reference Image
When Test Input Type is set to Single Paired Image, determines the path to the sample reference or output image used for testing.
Input Image Dir
When Test Input Type is set to Multiple Paired Images or Multiple Unpaired Images, determines the path to the directory of sample images used for testing.
Reference Image Dir
When Test Input Type is set to Multiple Paired Images or Multiple Unpaired Images, determines the path to the directory of sample reference or output images used for testing.
Composite Image Dir
When Test Input Type is set to Composite Paired Images, determines the path to the directory of composite images that define an input/reference testing pair.
Test Data Set Storage
Specifies how the test data set should be stored in memory.
Automatic
PyTorch will manage the storage and transfer of the data set between the CPU and the device used for training.
Keep on Training Device
Makes sure that the test data set is kept on the device selected for training.
Worker Threads
Specifies the number of background worker threads to use when loading input images. Using more threads can improve performance for large data sets. A value of 0 indicates the images should be loaded on the main training thread instead of in the background.
Test Batch Size
The number of test samples to load for each test iteration.
These parameters determine the frequency and size of the model testing step.
Test Frequency
The rate the model is tested, expressed as the number of iterations between test runs.
For example, a value of 1 means the model is tested after each training iteration. A value of 3 will run model testing every 3 iterations.
Test Size
The number of test data set images to use when testing the model. These images are randomly chosen from the test data set specified above.
Add Test Results as Outputs
When on, images written out during the testing process are added as work item outputs.
Checkpointing ¶
Model Path
When on, the raw PyTorch models for each network component are written to disk using the specified path format string. The format string can use the variables described in the overview section.
Model Save Rate
When Model Path is on, determines the rate the PyTorch model files are written to disk. The rate is expressed as the number of training iterations between writing model checkpoints. Checkpoints are always written on the final training iteration, even when not a multiple of the parameter value.
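The documented cadence, writing every N iterations plus always on the final iteration, reduces to a small predicate. An illustrative sketch (the function name is hypothetical):

```python
def should_write_checkpoint(iteration, save_rate, max_iterations):
    """True every `save_rate` iterations and always on the final iteration,
    even when it is not a multiple of the rate."""
    return iteration % save_rate == 0 or iteration == max_iterations
```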
Add Model Checkpoints as Outputs
When on, model checkpoints written out during the training process are added as work item outputs.
ONNX Path
When on, the generator component(s) of the model are converted to ONNX and written to disk using the specified format string. The format string can use the variables described in the overview section.
ONNX Save Rate
When ONNX Path is on, determines the rate the ONNX model files are written to disk. The rate is expressed as the number of training iterations between writing model checkpoints. Checkpoints are always written on the final training iteration even when not a multiple of the parameter value.
ONNX Version
Specifies the opset version when writing ONNX files.
Export Dynamic Axes
When on, the input image dimensions are exported as dynamic axes instead of fixed to the size of the training images. This makes it possible to use the exported ONNX model with inputs that aren’t the same size as the training set. However, the quality of the model outputs may be lower.
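Exporting with dynamic spatial axes corresponds to the dynamic_axes argument of torch.onnx.export. A hypothetical sketch of such an export, not the node's actual code (names and shapes are illustrative):

```python
import torch

def export_generator(generator, image_size, channels, path, opset, dynamic):
    """Export the generator to ONNX, optionally with dynamic H/W axes."""
    dummy = torch.randn(1, channels, image_size, image_size)
    dynamic_axes = ({"input": {2: "height", 3: "width"},
                     "output": {2: "height", 3: "width"}} if dynamic else None)
    torch.onnx.export(generator, dummy, path,
                      opset_version=opset,
                      input_names=["input"], output_names=["output"],
                      dynamic_axes=dynamic_axes)
```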
Add ONNX Checkpoints as Outputs
When on, ONNX model checkpoints written during the training process are added as work item outputs.
Plot Type
The plot file format to create, if any.
Disabled
Plotting is disabled.
CSV
Plots are written out as raw CSV files.
Image
Plots are written out as images using the matplotlib library.
Plot Rate
The rate the training plots are written, expressed as a number of iterations.
SSIM Plot
A plot of the SSIM metric recorded during training and testing. This is a rough estimate of the quality of the generated outputs produced by the model in both the training and testing process, based on the reference images in the corresponding data sets. If model testing is off, the plot will only include SSIM scores for the training data set.
Loss Plot
A plot of the loss/error values produced when training the model. Generally speaking, if the model is improving over the training process the loss values in the plot should trend towards 0.
Score Plot
A plot of the discriminator score(s) produced while training the model. These values represent how well the discriminator component of the network is able to distinguish between generated and real images.
Override Device
When on, override the device that PyTorch uses to train the model.
Set CPU Threads
When the device type is set to “cpu”, this parameter can be used to set the maximum number of CPU threads used during training. It has no effect on the training process when using a GPU device.
Environment Path
The path to the virtual environment on disk.
Python Bin
Determines which Python executable to use when creating the virtual environment and installing packages. Also the version of Python that will be associated with the venv.
Custom Python Bin
When Python Bin is set to Custom, determines the path to the Python interpreter to use when creating the virtual environment.
Use Symlinks
When on, the virtual environment is created with symlinks to the source Python binaries if possible. This is recommended when creating a virtual environment using Houdini’s Python interpreter.
Use Pip Cache
When enabled, pip will attempt to use cached packages on the local system instead of downloading them every time. This can improve the installation times when repeatedly installing the same Python package in different virtual environments.
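For reference, Python's standard venv module exposes the same symlink choice used by the Use Symlinks option. A minimal sketch with a hypothetical target path (the node's actual invocation is internal):

```python
import venv

# symlinks=True links to the source Python binaries instead of copying them.
venv.create("/path/to/env", symlinks=True, with_pip=True)
```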
TOP Scheduler Override
This parameter overrides the TOP scheduler for this node.
Schedule When
When enabled, this parameter can be used to specify an expression that determines which work items from the node should be scheduled. If the expression returns zero for a given work item, that work item will immediately be marked as cooked instead of being queued with a scheduler. If the expression returns a non-zero value, the work item is scheduled normally.
Work Item Label
Determines how the node should label its work items. This parameter allows you to assign non-unique label strings to your work items which are then used to identify the work items in the attribute panel, task bar, and scheduler job names.
Use Default Label
The work items in this node will use the default label from the TOP network, or have no label if the default is unset.
Inherit From Upstream Item
The work items inherit their labels from their parent work items.
Custom Expression
The work item label is set to the Label Expression custom expression which is evaluated for each item.
Node Defines Label
The work item label is defined in the node’s internal logic.
Label Expression
When on, this parameter specifies a custom label for work items created by this node. The parameter can be an expression that includes references to work item attributes or built-in properties. For example, $OS: @pdg_frame will set the label of each work item based on its frame value.
Work Item Priority
This parameter determines how the current scheduler prioritizes the work items in this node.
Inherit From Upstream Item
The work items inherit their priority from their parent items. If a work item has no parent, its priority is set to 0.
Custom Expression
The work item priority is set to the value of Priority Expression.
Node Defines Priority
The work item priority is set based on the node’s own internal priority calculations.
This option is only available on the Python Processor TOP, ROP Fetch TOP, and ROP Output TOP nodes. These nodes define their own prioritization schemes that are implemented in their node logic.
Priority Expression
This parameter specifies an expression for work item priority. The expression is evaluated for each work item in the node.
This parameter is only available when Work Item Priority is set to Custom Expression.