Creating a filter COP

This is an example of a 3x3 kernel filter which uses template classes to abstract the operation for various data formats. It also demonstrates how to deal with fetching input areas larger than a tile, and how to enlarging the canvas for the COP.

This section walks through COP2_SimpleFilter.C.

#include <UT/UT_DSOVersion.h>
#include <OP/OP_OperatorTable.h>
#include <PRM/PRM_Include.h>
#include <PRM/PRM_Parm.h>

These are required for any add-on node.

#include <SYS/SYS_Math.h>

This is required for SYSmin().

#include <TIL/TIL_Region.h>
#include <TIL/TIL_Tile.h>
#include <TIL/TIL_TileList.h>

These headers are required when working with tiles and regions. Most COPs will require all three.

#include <PXL/PXL_Pixel.h>

#include <RU/RU_Algorithm.h>

These headers are required to implement a new data-independent RU_Algorithm.

#include <COP2/COP2_CookAreaInfo.h>

#include "COP2_SampleFilter.h"

COP2_CookAreaInfo is required whenever getInputDependenciesForOutputArea() is overridden. And finally, we include our own header for the class definition of the node (COP2_SampleFilter).

COP2_MASK_SWITCHER(4, "HDK Sample Filter");
static PRM_Name names[] =
{
    PRM_Name("left",            "Left Enhance"),
    PRM_Name("right",           "Right Enhance"),
    PRM_Name("top",             "Top Enhance"),
    PRM_Name("bottom",          "Bottom Enhance"),
};
PRM_Template
COP2_SampleFilter::myTemplateList[] =
{
    PRM_Template(PRM_SWITCHER,  3, &PRMswitcherName, switcher),
    PRM_Template(PRM_FLT_J,     TOOL_PARM, 1, &names[0],PRMzeroDefaults),
    PRM_Template(PRM_FLT_J,     TOOL_PARM, 1, &names[1],PRMzeroDefaults),
    PRM_Template(PRM_FLT_J,     TOOL_PARM, 1, &names[2],PRMzeroDefaults),
    PRM_Template(PRM_FLT_J,     TOOL_PARM, 1, &names[3],PRMzeroDefaults),
    
    PRM_Template(),
};
OP_TemplatePair COP2_SampleFilter::myTemplatePair( 
     COP2_SampleFilter::myTemplateList,
    &COP2_MaskOp::myTemplatePair);

This block of code defines 4 parameters in a tab named "HDK Sample Filter", one of 3 tabs in the parm dialog (the other two are Mask and Frame Scope).

OP_VariablePair COP2_SampleFilter::myVariablePair( 0,&COP2_Node::myVariablePair );
const char *    COP2_SampleFilter::myInputLabels[] =
{
    "Image to Enhance",
    "Mask Input",
    0
};

This block defines any local variables specific to this node (none) and the input labels for our 2 inputs.

OP_Node *
COP2_SampleFilter::myConstructor(OP_Network     *net,
                                 const char     *name,
                                 OP_Operator    *op)
{
    return new COP2_SampleFilter(net, name, op);
}
COP2_SampleFilter::COP2_SampleFilter(OP_Network *parent,
                               const char *name,
                               OP_Operator *entry)
    : COP2_MaskOp(parent, name, entry)
{}
COP2_SampleFilter::~COP2_SampleFilter()
{}

The constructors and destructors. This node derives from COP2_MaskOp since a kernel filter can be masked.

Next we have the newContextData() method which populates an empty cop2_SampleFilterContext, defined in the header as:

class cop2_SampleFilterContext : public COP2_ContextData
{
public:
                 cop2_SampleFilterContext()
                     : myLeft(0.0f), myRight(0.0f), myTop(0.0f), myBottom(0.0f),
                       myKernel(NULL)
                 {}
    virtual     ~cop2_SampleFilterContext() { delete [] myKernel; }
    virtual bool createPerPlane() const { return false; }
    virtual bool createPerRes() const   { return true; }
    virtual bool createPerTime() const  { return true; }
    virtual bool createPerThread() const{ return false; }
    /// Parameters
    float       myLeft;
    float       myRight;
    float       myTop;
    float       myBottom;
    
    /// Kernel filter derived from parameters
    float      *myKernel;
};

This context data not only stashes the parameter values, but also the result of those parameters, a 3x3 matrix kernel filter. Note that any memory allocated within the context data object must be freed by that object.

COP2_ContextData *
COP2_SampleFilter::newContextData(const TIL_Plane *, int ,
                                  float t, int xres, int yres,
                                  int , int)
{
    // Necessary since parameters cannot be evaluated in doCookMyTile
    
    cop2_SampleFilterContext *data = new cop2_SampleFilterContext;
    float scx, scy;
    // The frame/scope effect allows the user to dial down the entire operation.
    int         index = mySequence.getImageIndex(t);
    float       effect = getFrameScopeEffect(index);
    // If cooking at a reduced res, scale down the effect for a closer
    // approximation.
    getScaleFactors(xres,yres, scx, scy);
    effect *= SYSmin(scx,scy);
    
    data->myLeft   = LEFT(t)    * effect;
    data->myRight  = RIGHT(t)   * effect;
    data->myTop    = TOP(t)     * effect;
    data->myBottom = BOTTOM(t)  * effect;
    data->myKernel = new float[9];
    // Kernel positions:
    // 0 1 2
    // 3 4 5 
    // 6 7 8
    data->myKernel[0] = -data->myLeft -data->myTop;
    data->myKernel[1] = -data->myTop;
    data->myKernel[2] = -data->myRight -data->myTop;
    data->myKernel[3] = -data->myLeft;
    data->myKernel[5] = -data->myRight;
    data->myKernel[6] = -data->myLeft -data->myBottom;
    data->myKernel[7] = -data->myBottom;
    data->myKernel[8] = -data->myRight -data->myBottom;
    // center
    data->myKernel[4] = 1.0f + 3.0f * (data->myLeft + data->myRight +
                                       data->myTop + data->myBottom);
    return data;
};

The purpose of this method is to evaluate and cache all parameters required for the cook in an object that can be reused by multiple threads across multiple tiles. It can also cache other data, such as the kernel computed here.

The getFrameScropEffect() call returns the cumulative effect of the Effect parameter and the parameters on the Frame Scope page. It is normally 1, but can be reduced down to 0 if the user changes these parms. It affects the kernel by reducing the edge detection factors.

The getScaleFactors() call returns the reduction in cooking resolution that can happen when Fast Interactive Cooking is on or the user cooks at a reduced resolution in the viewport (such as 50%). In order to more closely approximate the full resolution effect, the effect is reduced by the scale factor.

Next the parameters are evaluated and reduced by the effect factor. From those, a piecemeal sharpen filter is constructed and stored in the context data object. Finally, the new context data object is returned.

void
COP2_SampleFilter::computeImageBounds(COP2_Context &context)
{
    int x1,y1,x2,y2;
    // Grab the bounds from the mask op, which combines the mask with the input
    // bounds.
    COP2_MaskOp::computeImageBounds(context);
    // Now enlarge the bounds by 1 in each direction to account for the 3x3
    // kernel.
    context.getImageBounds(x1,y1,x2,y2);
    context.setImageBounds(x1-1, y1-1, x2+1, y2+1);
}

As our filter uses a 3x3 kernel, this can expand the canvas by 1 pixel in each direction. The COP2_MaskOp version must be called to account for the union of the mask image with the filtered image. Finally, we grab the bounds and assign it back to the context, one pixel larger in each direction.

void
COP2_SampleFilter::getInputDependenciesForOutputArea(
    COP2_CookAreaInfo &output_area,
    const COP2_CookAreaList &input_areas,
    COP2_CookAreaList &needed_areas)
{
    // Add dependencies on the first input and the mask plane.
    COP2_MaskOp::getInputDependenciesForOutputArea(output_area, input_areas,
                                                   needed_areas);
    // If bypassed, don't do anything else.
    if (getBypass())
        return;
    // Enlarge the needed area of the first input by 1 pixel in all directions.
    COP2_CookAreaInfo *inarea =
        makeOutputAreaDependOnMyPlane(0, output_area,input_areas,needed_areas);
    // It may not exist if the input node has an error.
    if(inarea)
        inarea->expandNeededArea(1, 1, 1, 1);
}

In order for the COP scheduler to effectively schedule tile cooking across many threads, it needs to build a data dependency tree. It passes to getInputDependenciesForOutputArea() an output_area that is required. From this, the method should add any input images that are needed to the needed_areas array.

The easiest way to do this is with the makeOutputDependOn...() family of methods. These methods add a dependency on the specified input image area, and return the COP2_CookAreaInfo that represents this. The various methods are:

COP2_Node::makeOutputAreaDependOnAllInputAreas()
All planes in all inputs are added to the needed_areas list. For when you really don't know beforehand, but use very sparingly.
COP2_Node::makeOutputAreaDependOnInputAreas()
All planes in the input are added to the needed_areas list.
COP2_Node::makeOutputAreaDependOnInputPlane()
The specified plane for an input is added.
COP2_Node::makeOutputAreaDependOnMyPlane()
The plane corresponding to the one requested in output_area is added for the given input.

In this case, makeOutputAreaDependOnMyPlane() is used, as the filter does not require any extra input planes to produce the output. For nodes derived from COP2_MaskOp, a call to the parent class version of getInputDependenciesForOutputArea() should be made first to account for the mask.

Finally, we expand the bounds of the inarea by 1 pixel in all directions, as the filter needs an extra neighbor pixel to do a 3x3 filter (a 5x5 would require 2 in all directions, etc).

const char  *
COP2_SampleFilter::getOperationInfo()
{
    return "This operation enhances individual edges.";
}

Time to take a break from the complicated stuff for a moment. This method returns an plain English description of the operation. This will appear in the operator info popup.

class cop2_EdgeEnhance : public RU_Algorithm
{
public:
             cop2_EdgeEnhance(const float *kernel) : myKernel(kernel) {}
    virtual ~cop2_EdgeEnhance() {}
    DECLARE_FILTER_OP(cop2_EdgeEnhanceOp);
    const float *myKernel;
};
template<class Type,int fast> class cop2_EdgeEnhanceOp
    : public RU_FilterOp<Type,fast>
{
public:
                 cop2_EdgeEnhanceOp(RU_Algorithm *alg)
                     : RU_FilterOp<Type,fast>(alg)
                 { ; }
    virtual      ~cop2_EdgeEnhanceOp() {;}
    virtual int filter(TIL_TileList *output,
                       const TIL_Region *input, float t,
                       int thread=-1, void *data=0);
};
IMPLEMENT_FILTER_OP(cop2_EdgeEnhance, cop2_EdgeEnhanceOp);

In order to run the filter on any incoming data type, from 8b int to 32b FP, an RU_Algorithm can be used. This uses C++ template to generate code for the various data types. An RU_Algorithm-based operation requires two classes - a trivial container class (in this case, cop2_EdgeEnhance) and the template class that implements the algorithm (cop2_EdgeEnhanceOp).

The container class calls the appropriate template based on the data types, hiding a large if()/else if() codepath. It also contains any parameters that the algorithm needs to function - in this case, just a pointer to the kernel.

The template class can be one of four different types:

RU_GeneratorOp - Generates image data without using an input image
RU_PixelOp - Adjusts an image using an input image, without using neighboring pixels
RU_FilterOp - Filters an image using an input image, with neighboring pixels
RU_BinaryOp - Combines two images into one image

For this example, we're using an RU_FilterOp. The RU_Algorithm class uses the DECLARE_FILTER_OP() macro to interface with the template class. There are other macros that correspond to the other template class types, DECLARE_GENERATOR_OP, DECLARE_PIXEL_OP and DECLARE_BINARY_OP().

template<class Type,int fast> int
cop2_EdgeEnhanceOp<Type,fast>::filter(TIL_TileList *output,
                                      const TIL_Region *input, float t,
                                      int thread, void *data)
{
    PXL_Pixel<Type,fast> pixel(output->myBlack, output->myWhite);
    cop2_EdgeEnhance *parm = static_cast<cop2_EdgeEnhance *>(myAlg);
    const float *kernel = parm->myKernel;
    TIL_Tile    *itr=0;
    const Type  *source_data, *iscan1, *iscan2, *iscan3;
    Type        *dest_data, *scan;
    int          ti;
    int          stride, istride;
    float        sum;
    int          x,y;
    int          w,h;
    w = output->myX2 - output->myX1 + 1;
    h = output->myY2 - output->myY1 + 1;
    stride = w;
    istride = w + 2;
    
    FOR_EACH_UNCOOKED_TILE(output, itr, ti)
    {
        dest_data     = (Type *) itr->getImageData();
        source_data   = (Type *) input->getImageData(ti);
        // 3 scanlines for a 3x3 kernel
        iscan1 = source_data + 1;
        iscan2 = iscan1 + istride;
        iscan3 = iscan2 + istride;
        scan = dest_data;
        for(y=0; y<h; y++)
        {
            for(x=0; x<w; x++)
            {
                pixel.set(iscan1[x-1]);
                sum  = (float)pixel * kernel[0];
                
                pixel.set(iscan1[x]);
                sum += (float)pixel * kernel[1];
                
                pixel.set(iscan1[x+1]);
                sum += (float)pixel * kernel[2];
                pixel.set(iscan2[x-1]);
                sum += (float)pixel * kernel[3];
                
                pixel.set(iscan2[x]);
                sum += (float)pixel * kernel[4];
                
                pixel.set(iscan2[x+1]);
                sum += (float)pixel * kernel[5];
                pixel.set(iscan3[x-1]);
                sum += (float)pixel * kernel[6];
                
                pixel.set(iscan3[x]);
                sum += (float)pixel * kernel[7];
                
                pixel.set(iscan3[x+1]);
                sum += (float)pixel * kernel[8];
                // Assign to the output array
                pixel = sum;
                scan[x] = pixel.getValue();
            }
            
            scan += stride;
            iscan1 += istride;
            iscan2 += istride;
            iscan3 += istride;
        }
    }
    return 1;
}

This is the heart of the COP - the templated operation itself. In order to do the various data conversions, PXL_Pixel<Type,fast> is used. By using its set() method, the native data type is assigned to the pixel. By casting it to a float, it is easy to work with. And finally, the reverse can be done when assigning data back, assigning the float sum to it, and then extracting the native value using getValue().

The algorithm itself iterates over each tile in the list. It grabs the output tile and the corresponding input region image data, which is 1 pixel larger in each direction than the tile.

The application of the kernel is done by accessing the eight neighboring pixels and the corresponding pixel, multiplying them by the kernel matrix, and then summing the results.

Note: This algorithm can be optimized a lot more than shown using a sliding window, specialized templates and MMX/SSE intrinsics, but for this example it's being kept simple.

OP_ERROR
COP2_SampleFilter::doCookMyTile(COP2_Context &context, TIL_TileList *tiles)
{
    // Grab our context data.
    cop2_SampleFilterContext *data =
        static_cast<cop2_SampleFilterContext *>(context.data());
    // Grab the input image data that we need for our tile area.
    TIL_Region *in = inputRegion(0, context,
                                 tiles->myX1 -1,
                                 tiles->myY1 -1,
                                 tiles->myX2 +1,
                                 tiles->myY2 +1,
                                 TIL_HOLD); // streak edges when outside canvas
    if(!in)
    {
        tiles->clearToBlack();
        return error();
    }
    // call the templated operation
    cop2_EdgeEnhance op(data->myKernel);
    op.filter(tiles, in, context.getTime(), NULL, context.myThreadIndex);
    releaseRegion(in);
    // done - return any errors. 
    return error();
}

First, we fetch our cop2_SampleFilterContext from the context passed to us.

The cook method then fetches the input region that is required and passes it to the cop2_EdgeEnhance algorithm along with the kernel matrix. It then releases the input region. If for any reason the input region could not be accessed, it clears the tiles to black and exits early.

A region is a convenient abstraction to use, as it can marshall any area of the tiled input into a contiguous block; a subimage of the input. You can also convert the region to floating point (or any data format) by using a signature of inputRegion() that takes an extra TIL_Plane pointer. You can copy the context's plane and then assign it a new data format using setFormat().

Regions keep the components non-interleaved; each component is its own single channel subimage. These subimages can be accessed by TIL_Region::getImageData().

void
newCop2Operator(OP_OperatorTable *table)
{
    table->addOperator(new OP_Operator("hdksamplefilt",
                                       "HDK Sample Filter",
                                       COP2_SampleFilter::myConstructor,
                                       &COP2_SampleFilter::myTemplatePair,
                                       1, 
                                       2, // optional mask input.
                                       0, // no vars
                                       0, // not generator
                                       COP2_SampleFilter::myInputLabels));
}

Finally, all HDK OPs require registration. This operator takes a required input for the image to filter, and an optional input for the mask. It has no local variables and is not a generator. The input labels defined above are passed as well, to improve the usability of the node tile.