Principal Component Analysis geometry node

Computes the principal components of volume or point data.

Principal Component Analysis, or PCA, is a way to reduce high-dimensional data to its most important components. It does this by finding the features that reflect the most variation in the data.

For a point cloud, these correspond to the most important directions, so PCA is often used for oriented box fitting.
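As a rough illustration of the "most important direction" idea, the sketch below finds the dominant axis of a small 2D point cloud from its 2x2 covariance matrix, using the closed-form major-axis angle. It is plain Python, independent of Houdini; the function name principal_angle_2d and the sample points are hypothetical.

```python
import math

def principal_angle_2d(points):
    """Return the angle (radians) of the direction of greatest variance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix of the centered points.
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form orientation of the major axis of a symmetric 2x2 matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

# Hypothetical points scattered roughly along the line y = x.
cloud = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9), (4.0, 4.0)]
angle = principal_angle_2d(cloud)  # close to 45 degrees for this data
```

Rotating the points by the negative of this angle and taking the minimum and maximum along each axis would then give the extents of an oriented box.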

After the principal components are determined, further samples can be projected onto the components. This gives a different view of the data where the most important dimensions are provided first. The sample can then be reconstructed by linearly combining the components with the computed weights.

Using only a few of the principal components in the reconstruction acts as a filter that removes the least important variation. Given temporal input data, this forms a sort of de-jitter filter that targets uncorrelated noise.

For an example that shows how linear blend skinning can be improved on by learning from random poses, download the ML Deformer files from the Content Library.

Parameters

Data Type

The source data to use for analysis.

Point Attributes

The P position attribute is used. The points are batched into consecutive groups, so all the input samples must have the same number of points and be concatenated together.

Volumes

Each volume becomes a sample. All the volumes must match in resolution and tuple-size.

Attributes

A space-separated list of point attributes to analyse. To save memory, only these attributes are output on the list of modes.

In the reconstruction and projection modes, attributes on this list will be used for the projection and reconstruction.

Points per Sample

How many points will form each sample for the analysis. This must divide evenly into the total number of points.
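The batching described here can be pictured with a minimal sketch, assuming each sample is simply the concatenation of its points' coordinates. How the node lays out its data internally is not documented here, so the helper split_into_samples and the example positions are hypothetical.

```python
def split_into_samples(positions, points_per_sample):
    """Group a flat list of point positions into consecutive samples."""
    if points_per_sample <= 0 or len(positions) % points_per_sample != 0:
        raise ValueError("points_per_sample must divide the point count evenly")
    samples = []
    for start in range(0, len(positions), points_per_sample):
        block = positions[start:start + points_per_sample]
        # Flatten the (x, y, z) tuples so each sample is one long vector.
        samples.append([c for p in block for c in p])
    return samples

# Six points, two points per sample: three samples of dimension six.
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 2, 0), (1, 2, 0)]
samples = split_into_samples(positions, 2)
```

With this layout, three samples of two points each yield three vectors of six values, and a point count that is not a multiple of the sample size is rejected.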

Mode

In addition to computing the PCA, this node can also project and reconstruct.

Analyse

Find the principal components of the input data. The output consists of the average along with the specified number of orthonormal components. The eigenvalue associated with each component is provided in the eval attribute, which gives an indication of that component's importance. The component attribute stores which component each point belongs to, with 0 being the average.
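To make the shape of this output concrete, here is a minimal sketch that returns a mean, a set of orthonormal components, and their eigenvalues. It uses power iteration with deflation, a simple textbook technique chosen for brevity; the solver the node actually uses is not stated here, and dividing the covariance by the sample count is likewise an assumption. The function name pca_power_iteration is hypothetical.

```python
import math

def pca_power_iteration(samples, num_components, iterations=200):
    """Return (mean, components, eigenvalues) for equal-length sample vectors."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    centered = [[s[i] - mean[i] for i in range(dim)] for s in samples]
    # Covariance of the centered data (divided by n; an assumption).
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]
    components, eigenvalues = [], []
    for k in range(num_components):
        # Deterministic, normalized start vector that differs per component.
        v = [1.0 if i == k % dim else 0.5 for i in range(dim)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        cov_v = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        eigenvalue = sum(v[i] * cov_v[i] for i in range(dim))
        components.append(v)
        eigenvalues.append(eigenvalue)
        # Deflate: subtract the found component so the next pass finds the next one.
        for i in range(dim):
            for j in range(dim):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return mean, components, eigenvalues

# Hypothetical 2D samples that vary more along x than along y.
data = [(2.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
mean, components, eigenvalues = pca_power_iteration(data, 2)
```

For this data the first component lines up with the x axis and has the larger eigenvalue, mirroring how eval ranks the components by importance.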

Note

This can be a very slow operation as the dimensionality or sample size increases. The cost is governed by the smaller of the two but grows very quickly with it, so you will want either the dimension or the sample size to be less than 25,000.

Scree Chart

A Scree Chart provides an intuitive way to see where additional components yield diminishing returns. Two graphs are given: one is the relative contribution of each component, and the other is the cumulative total of all components up to that point. Usually a bend in the graph is taken as a signal of where additional components are noise, but you can also use the cumulative total to make sure a certain percentage of the variation is captured.
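The two curves can be sketched from a list of eigenvalues as follows. This assumes the chart is driven by each eigenvalue's share of the total, which is the usual definition of a scree plot but is not confirmed here for this node; the function names and eigenvalues are hypothetical.

```python
def scree_values(eigenvalues):
    """Return (proportional, cumulative) variance fractions."""
    total = sum(eigenvalues)
    proportional = [e / total for e in eigenvalues]
    cumulative = []
    running = 0.0
    for p in proportional:
        running += p
        cumulative.append(running)
    return proportional, cumulative

def components_for_coverage(cumulative, target):
    """Smallest number of components whose cumulative share reaches target."""
    for count, value in enumerate(cumulative, start=1):
        if value >= target:
            return count
    return len(cumulative)

# Hypothetical eigenvalues, already sorted from most to least important.
eigenvalues = [4.0, 2.0, 1.0, 1.0]
proportional, cumulative = scree_values(eigenvalues)
needed = components_for_coverage(cumulative, 0.85)
```

Here the first two components cover 75% of the variation, so a target of 85% requires three components.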

Project

This projects the input sample onto the set of components specified by the second input. The second input is usually the result of another Principal Component Analysis SOP in Analyse mode.

The sample is projected onto each of the components, generating for each an output point with a weight attribute that specifies how much that component contributed. Linearly combining all the components with these weights gives the best possible reconstruction of the input sample.
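Because the components are orthonormal, each weight is a dot product between the mean-subtracted sample and the component. The following minimal sketch shows that calculation; the function name project and the example values are hypothetical, and the node's exact handling of attributes is not shown.

```python
def project(sample, mean, components):
    """Return one weight per component for a single sample vector."""
    centered = [s - m for s, m in zip(sample, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]

# Hypothetical mean and two orthonormal components in three dimensions.
mean = [1.0, 2.0, 3.0]
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
sample = [3.0, 1.0, 3.5]
weights = project(sample, mean, components)
```

The sample lies 2 units along the first component and -1 along the second; its offset along the third axis is not captured because no component spans it.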

Reconstruct

Given a set of points with weight attributes, the second input’s components are linearly combined to form the output. This can be used to reconstruct a known sample. If the sample was projected with fewer components than the full dimensionality, this removes the least-important variations in the original sample set, thereby filtering out uncorrelated noise.

The weights can also come from other sources, or be scaled to achieve interesting effects.
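Reconstruction is the reverse linear combination: start from the mean and add each component scaled by its weight. The sketch below shows a full reconstruction, a filtered one that drops the last component, and one with scaled weights. The function name reconstruct and all values are hypothetical.

```python
def reconstruct(weights, mean, components):
    """Return mean + sum(weight * component) as a single vector."""
    result = list(mean)
    for w, comp in zip(weights, components):
        for i, c in enumerate(comp):
            result[i] += w * c
    return result

# Hypothetical mean, orthonormal components, and projection weights.
mean = [1.0, 2.0, 3.0]
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [2.0, -1.0, 0.5]

full = reconstruct(weights, mean, components)               # all components
filtered = reconstruct(weights[:2], mean, components[:2])   # drop the last one
exaggerated = reconstruct([2.0 * w for w in weights], mean, components)
```

Dropping the last component returns that axis to its mean value, which is the filtering effect described above, while doubling the weights pushes the sample twice as far from the mean.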

Include Mean Weights

In the literature, PCA is usually described as something run over normalized data from which the mean has already been subtracted. However, for convenience, the mean of the input is included in the output of the analysis. Likewise, the project and reconstruct modes include the weight for the mean as the first weight point. In projection this weight is always one, and in reconstruction it will likewise often be one.

Turning this off will cause the projection not to output the mean weight and the reconstruction to assume it is one. This can simplify exporting these weights to other processes.
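The effect of this toggle on the weight list can be sketched as below, assuming the mean weight is simply prepended to the component weights. The function names and the include_mean flag are hypothetical stand-ins for the parameter, not part of any Houdini API.

```python
def project_with_mean(sample, mean, components, include_mean=True):
    """Project a sample; optionally prepend the mean weight (always 1.0)."""
    centered = [s - m for s, m in zip(sample, mean)]
    weights = [sum(c * x for c, x in zip(comp, centered)) for comp in components]
    return [1.0] + weights if include_mean else weights

def reconstruct_with_mean(weights, mean, components, include_mean=True):
    """Reconstruct a sample; assume a mean weight of 1.0 when it is absent."""
    if include_mean:
        mean_weight, weights = weights[0], weights[1:]
    else:
        mean_weight = 1.0
    result = [mean_weight * m for m in mean]
    for w, comp in zip(weights, components):
        for i, c in enumerate(comp):
            result[i] += w * c
    return result

# Hypothetical mean, single component, and sample.
mean = [1.0, 2.0]
components = [[1.0, 0.0]]
sample = [4.0, 2.0]

with_mean = project_with_mean(sample, mean, components, include_mean=True)
without_mean = project_with_mean(sample, mean, components, include_mean=False)
rebuilt = reconstruct_with_mean(without_mean, mean, components, include_mean=False)
```

Both settings reconstruct the same sample; the only difference is whether the leading weight of one is carried in the exported list.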

Number of Components

How many components to extract. A value of -1 extracts all the components, allowing a full reconstruction of the input samples.

X Scale

The horizontal scale of the scree chart.

Y Scale

The vertical scale of the scree chart.

Prop Var Color

The color of the proportional variance of each component in the scree chart.

Cum Var Color

The color of the cumulative variance in the scree chart.
