nola602
April 24, 2026 19:38:29
I am using the new copnet in Houdini 21 and am trying to translate a layer by a fixed pixel amount. I need this to work as part of a topnet, so I cannot manually enter the correct image size, since I will be batch processing images of various sizes.
I have tried to use Transform2D but cannot figure out how to translate by a fixed pixel amount, or even how the translate works.
I have a rectangular image. When I enter the same translation amount for x and y, such as 0.1, 0.1, the image translates by the same amount in both dimensions even though my image is rectangular. But what is the reference resolution for this? It doesn't seem to be related to my image size. In addition, there is a crosshair next to the translate inputs that says "click the layer to set the translate in image space"; if I then click my image in the scene view, it doesn't seem to translate my clicks into image space correctly.
Second, since I want to translate by pixels, I need to somehow convert my desired pixel translation into the translate's "reference resolution". But how do I get the image resolution in this node? I have tried res(0, D_XRES), but this always evaluates to 0 and appears not to work in Copernicus (
https://www.sidefx.com/docs/houdini/expressions/res.html)
So to summarize: how does the Transform2D translate work, what are the units, and how do I translate by pixels instead?
This is a very interesting topic, one I spent a long time puzzling over, trying to understand the logic behind it. Ultimately it is all quite logical; how convenient it is is another question.
So, the main idea is that we view and combine images of different resolutions in a normalized form. Our viewing/working area is symmetrical around 0, i.e. (-1, 1). This is very convenient for all kinds of formulas, centering, and so on, and by default your coordinates on the plane are in this image space. It is much more convenient than (0, 1), but in some situations (0, 1) is what you need, for example when you want to use UVs or brightness as a coordinate. You can simplify it and think in terms of minimum and maximum: the maximum is always 1, but what counts as the minimum and, accordingly, the midpoint? Either the minimum is -1 and the midpoint is 0, which is "image space" (-1, 1), or the minimum is 0 and the midpoint is 0.5, which is "texture space" (0, 1).

Accordingly, every pixel in any image has coordinates in both of these spaces, regardless of resolution. When a node or function receives the number 0.1 as a coordinate, it has to know what you mean: a pixel 10% above the middle, i.e. in the upper-right quadrant, or 10% above the lower-left corner? That is why it's important to be clear about which space a node uses, to avoid confusion. Typically, coordinates in all nodes are expected and interpreted internally as (-1, 1), with 0 being the middle. UVs, however, can be interpreted either way, since maps from files are in 0-1, while your internal UVs are more conveniently written in (-1, 1). Some nodes have a space selector, others don't, and then you need to know that they expect (-1, 1). If you feed such a node a 0-1 gradient, those coordinates cover only the upper-right quadrant, and you need to remap them so that your 0 becomes -1 and the coordinates are at least interpreted correctly (a minimal sketch of this remap follows below).
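If it helps to see that remap written out, here is a minimal sketch in plain Python (my own illustration, nothing Houdini-specific):

# Remap between the two spaces described above.
def tex_to_image(u):
    # texture space (0, 1) -> image space (-1, 1): 0 -> -1, 0.5 -> 0, 1 -> 1
    return u * 2.0 - 1.0

def image_to_tex(x):
    # image space (-1, 1) -> texture space (0, 1): -1 -> 0, 0 -> 0.5, 1 -> 1
    return (x + 1.0) * 0.5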
So, any image of any resolution is automatically normalized to (0, 1) and (-1, 1), you can combine them, and pixels with the same coordinates will match. This is convenient when, for example, you want to use a low-resolution image as a mask for a high-resolution one without worrying about the resolution; everything is correctly averaged and calculated under the hood. Under the hood, sampling is based on the area a pixel occupies, and colors are averaged: your pixel's area may cover multiple or fractional chunks of the target's pixels, and they are automatically averaged over that area, giving you an average color. This makes it easy to get nice anti-aliasing and interpolation without worrying about pixel matching at all.
You can also change images to higher or lower resolutions—all your effects will remain functional because you're resolution-agnostic.
You also have a proxy mode, and the number of pixels used in calculations doesn't have to match the actual pixels. This means you can do part of the work at a low resolution and only switch to the full resolution later. Pixel snapping would make this impossible.
So there are plenty of reasons against pixel snapping, and they're completely justified. This shouldn't surprise you: if you work in Houdini, you realized long ago that normalized values are more convenient. We convert everything to 0-1 and then multiply by a custom multiplier. The same applies to UVs, right? All COP nodes, with the exception of a few parameters for blur and crop, work in normalized space.
BUT! The downside of this convenience shows up when we specifically want pixel-perfect accuracy, which runs against this whole logic. Pixel-perfect accuracy means trading away universality and proxies, and it is the user's responsibility; there are no ready-made tools for it.
And one more thing, a very IMPORTANT point. When you create or load an image, it is fitted into your square, similar to the effect of the matchsize SOP node. That is, if your image is rectangular, which is the most common case, the longer side gets the maximum extent and coordinates (-1, 1), and therefore a length of 2, while the shorter side will be smaller. And for an arbitrary image you can't know in advance whether it is X or Y that ends up less than 1: with 1920*1080 you have (-1, 1) on the X axis, with 1080*1920 on the Y axis. You need to know the resolution, and there are no tools for that except Python (see the sketch below).
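To make the aspect handling concrete, here is a tiny sketch of how a resolution maps to image-space extents (plain Python, my own illustration; no hou calls):

def image_extents(xres, yres):
    # The longer side always spans (-1, 1), i.e. a length of 2;
    # the shorter side gets proportionally less.
    longest = float(max(xres, yres))
    return (2.0 * xres / longest, 2.0 * yres / longest)

print(image_extents(1920, 1080))  # (2.0, 1.125): X spans (-1, 1)
print(image_extents(1080, 1920))  # (1.125, 2.0): Y spans (-1, 1)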
What happens when we decrease the scale, what does rasterization have to do with it, and how do Transform3D and Transform2D differ? Let's take a File node. By default it's the butterfly, at a resolution of 512*512. Transform3D, rasterization disabled, you scale it by 0.2: your layer shrinks by a factor of 5 in the 3D viewport, but all of its 512*512 pixels are still there, even though they've become small! And the image-space coordinates haven't changed. The 2D viewport displays in image space, so from its point of view nothing has changed; the image still goes from -1 to 1. If you blend it over a 1024*1024 checkerboard (the BG), your butterfly is projected onto the BG pixels as it sits in the physical area. The starting squares were identical and we scaled the FG down by a factor of 5, so physically the butterfly layer covers 1/5 * 1/5 of the BG square. That means our 512*512 FG pixels are projected onto 1/5 * 1024, i.e. about 204*204 BG pixels, so each BG pixel covers about 2.5 FG pixels per axis and receives the average color of roughly 6 FG pixels. Automatically, hurray! (A quick check of this arithmetic is sketched below.)
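A quick sanity check of those numbers in plain Python (my own illustration):

bg_res, fg_res, scale = 1024, 512, 0.2
covered = bg_res * scale                  # ~204.8 BG pixels per side covered by the scaled FG
per_axis = fg_res / covered               # ~2.5 FG pixels per BG pixel along one axis
print(covered, per_axis, per_axis ** 2)   # ~204.8, ~2.5, ~6.25 FG pixels averaged into each BG pixel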
Now for option 2: we enable Rasterize in Transform3D. Our scaled-down butterfly image is projected onto a clean layer of its own resolution, i.e. into 1/5 of 512, approximately 100*100 pixels. This is now a new layer, where the butterfly occupies 100*100 pixels and is quite murky. The layer itself has the default size of 2*2, like any new layer, so in the 2D viewer you will now see a small butterfly, because image space now describes the new, full 512*512 layer. (Keep the pixels themselves, as objects, distinct from the pixel colors obtained from the projection!) With the default wrap border you'll get multiple repeating butterflies; with clamp you'll get a single butterfly in the center on a pure black field. Honestly, for me this kind of tiling is the only situation where rasterization in Transform3D would be necessary. For a blend with another layer, you always want to project onto the target layer at its full resolution, and you don't need an intermediate rasterization.
When you change the translate, those are world units: shift by 1 and your butterfly will always move from the center (0) to the edge of the square (1).
Transform2D is essentially a clone of Transform3D with a locked Z axis. You can only move in two coordinates and rotate only on the Z axis. In Transform3D, you can tilt the plane by rotating it along the X axis and get perspective distortion for free by enabling the perspective camera checkbox, as the camera shoots from above along the Z axis.
But how do you achieve pixel matching, since the projection interpolates by default?
Probably the easiest way is to keep using Transform, but use Python to convert your offsets from a number of pixels into normalized coordinates and put that number into the parameter as an expression. Think of the pixel grid as voxels with a size. Say your BG is 1024*1024 and you're blending an FG of 100*200. In normalized space their edges match, but that's not what you want: you need to physically shrink your FG so that the pixel sizes match. What is the pixel size? Your layer's physical size is 2 (from -1 to 1), which is 1024 pixels for the BG, so 1 pixel = 2/1024; that is your Transform3D step for 1 pixel. But your FG currently has a physical pixel size of 2/200, which is much larger. You first need to shrink the FG by that ratio, i.e. scale = (2/1024) / (2/200) = 200/1024. After that your pixels start to match, and you can move the layer discretely with that step (simply a parameter for the number of pixels and an expression in the translate that multiplies the step by the number of pixels from the parameter); a small sketch of this arithmetic follows below. Here it is essential to also hit the center of the pixel, otherwise, even though the pixel size is the same, it will project onto several pixels. Also pay attention to the Filter parameter in the blend: to avoid interpolation, set the first item to point. The others interpolate neighbors with increasing strength, which is useful in some cases but not for pixel-perfect matching.
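Here is that calculation as a minimal sketch in plain Python (my own illustration; resolutions are passed in as plain numbers, fetching them from the node is covered just below):

def pixel_match(bg_res, fg_res):
    # bg_res, fg_res: (xres, yres) tuples. The longest side spans (-1, 1), i.e. length 2.
    bg_longest = float(max(bg_res))
    fg_longest = float(max(fg_res))
    step = 2.0 / bg_longest          # image-space size of one BG pixel
    scale = fg_longest / bg_longest  # = (2 / bg_longest) / (2 / fg_longest)
    return step, scale

step, scale = pixel_match((1024, 1024), (100, 200))
print(step, scale)   # 0.001953125, 0.1953125 (= 200/1024), matching the numbers above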
Here you might be wondering how to determine the resolution of an arbitrary input image. Something like this for input 0:
node = hou.pwd()
inputs = node.inputs()
if inputs and inputs[0] is not None:
    # X resolution of the layer coming in on input 0
    return inputs[0].layer().bufferResolution()[0]
return 0
Next, calculate the steps, save a preset on the transform, and voila. Don't forget to take the maximum component; remember that the longest side is the one that spans image space with a length of 2.
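For example (a hypothetical setup, reusing the same layer()/bufferResolution() calls as in the snippet above): add an integer parameter, say pixels_x, to the transform and drive the Translate X component with a Python expression along these lines:

node = hou.pwd()
inputs = node.inputs()
if inputs and inputs[0] is not None:
    xres, yres = inputs[0].layer().bufferResolution()
    step = 2.0 / max(xres, yres)              # one target pixel in image space
    return node.evalParm("pixels_x") * step   # pixels_x is a custom parameter, not built in
return 0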
In general, I'm sure that scaling the current layer to match the pixel size of a target layer should be a built-in transform option, but it's not there yet, so for now it's Python expressions only.
If you want to do everything quickly and inside a snippet, then welcome to OpenCL, where you have to solve this yourself. There are functions that convert normalized coordinates to pixels, though via buffer coordinates, but that can be worked out:
@name.imageToBuffer(xy)
Returns the image coordinate transformed to buffer coordinates.
@name.bufferToImage(xy)
Returns the buffer coordinate transformed to image coordinates.
@name.pixelToBuffer(xy)
Returns the pixel coordinate transformed to buffer coordinates.
@name.bufferToPixel(xy)
Returns the buffer coordinate transformed to pixel coordinates.
So you can accurately calculate pixels and coordinates here. It'll take a bit of fiddling to figure it out, but the speed will be instant.