HDK
UT_CoarsenedRange< RANGE > Class Template Reference

#include <UT_ParallelUtil.h>


Public Member Functions

 UT_CoarsenedRange (UT_CoarsenedRange &range, tbb::split spl)
 
bool is_divisible () const
 

Friends

template<typename Range , typename Body >
void UTparallelFor (const Range &range, const Body &body, const int subscribe_ratio, const int min_grain_size, const bool force_use_task_scope)
 
template<typename Range , typename Body >
void UTparallelReduce (const Range &range, Body &body, const int subscribe_ratio, const int min_grain_size, const bool force_use_taskscope)
 
template<typename Range , typename Body >
void UTparallelDeterministicReduce (const Range &range, Body &body, const int grain_size, const bool force_use_taskscope)
 

Detailed Description

template<typename RANGE>
class UT_CoarsenedRange< RANGE >

UT_CoarsenedRange: This should be used only inside UTparallelFor and UTparallelReduce. This class wraps an existing range with a new range. This allows us to use simple_partitioner, rather than auto_partitioner, which has disastrous performance with the default grain size in tbb 4.
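
The wrapping idea can be illustrated without TBB or the HDK. The sketch below is an assumption about the general technique, not the code in UT_ParallelUtil.h: SplitTag, IndexRange, CoarsenedRange and countLeaves are hypothetical names, and the stored grain size is an illustrative guess at how a wrapper could stop a simple partitioner from splitting all the way down to single items.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for tbb::split; it only selects the splitting constructor.
struct SplitTag {};

// Minimal half-open index range [begin, end) that is divisible down to one item.
struct IndexRange {
    std::size_t myBegin, myEnd;
    IndexRange(std::size_t b, std::size_t e) : myBegin(b), myEnd(e) {}
    // Splitting constructor: *this takes the upper half, r keeps the lower half.
    IndexRange(IndexRange &r, SplitTag)
        : myBegin(r.myBegin + r.size() / 2), myEnd(r.myEnd)
    {
        r.myEnd = myBegin;
    }
    std::size_t size() const { return myEnd - myBegin; }
    bool empty() const { return myBegin == myEnd; }
    bool is_divisible() const { return size() > 1; }
};

// Illustrative wrapper: inherits the wrapped range and refuses to split
// once a piece is at or below the grain size.
template <typename RANGE>
class CoarsenedRange : public RANGE {
public:
    CoarsenedRange(const RANGE &r, std::size_t grain) : RANGE(r), myGrain(grain)
    {
        assert(grain > 0);
    }
    CoarsenedRange(CoarsenedRange &r, SplitTag s) : RANGE(r, s), myGrain(r.myGrain) {}
    bool is_divisible() const
    {
        return RANGE::is_divisible() && RANGE::size() > myGrain;
    }
private:
    std::size_t myGrain;
};

// Split recursively until the range reports it is not divisible,
// as a simple partitioner would, and count the resulting pieces.
template <typename R>
std::size_t countLeaves(R r)
{
    if (!r.is_divisible())
        return 1;
    R upper(r, SplitTag());
    return countLeaves(r) + countLeaves(upper);
}
```

With these definitions, countLeaves(IndexRange(0, 1024)) produces one piece per item, while wrapping the same range with a grain size of 256 produces only four pieces.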

Definition at line 100 of file UT_ParallelUtil.h.

Constructor & Destructor Documentation

template<typename RANGE>
UT_CoarsenedRange< RANGE >::UT_CoarsenedRange ( UT_CoarsenedRange< RANGE > &  range,
tbb::split  spl 
)
inline

Definition at line 108 of file UT_ParallelUtil.h.

Member Function Documentation

template<typename RANGE>
bool UT_CoarsenedRange< RANGE >::is_divisible ( ) const
inline

Definition at line 116 of file UT_ParallelUtil.h.

Friends And Related Function Documentation

template<typename RANGE>
template<typename Range , typename Body >
void UTparallelDeterministicReduce ( const Range &  range,
Body &  body,
const int  grain_size,
const bool  force_use_taskscope = true 
)
friend

This is a simple wrapper for deterministic reduce that uses tbb. It works in the same manner as UTparallelReduce, with the following differences:

  • reduction and join order is deterministic (devoid of threading uncertainty);
  • a fixed grain size must be provided by the caller; grain size is not adjusted based on the available resources (this is required to satisfy determinism).

This version should be used when task joining is not associative (such as accumulation of a floating point residual).
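
To see why a fixed grain size matters, the following standalone sketch (it does not call the HDK or TBB, and fixedGrainSum is a hypothetical helper) sums doubles by recursive halving. The grouping of the additions depends only on the grain size, so the same grain size always gives the same result, while a different grain size may not, because floating point addition is not associative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum v[begin, end) by recursive halving with a fixed grain size.
// Split points and join order depend only on (begin, end, grain),
// never on how many threads happen to be available.
double fixedGrainSum(const std::vector<double> &v,
                     std::size_t begin, std::size_t end, std::size_t grain)
{
    assert(grain > 0 && begin <= end && end <= v.size());
    if (end - begin <= grain)
    {
        // Leaf: accumulate left to right.
        double sum = 0;
        for (std::size_t i = begin; i != end; ++i)
            sum += v[i];
        return sum;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    const double left = fixedGrainSum(v, begin, mid, grain);
    const double right = fixedGrainSum(v, mid, end, grain);
    // Join: always left + right, in that order.
    return left + right;
}
```

For the input {1e20, 1.0, -1e20, 1.0}, a grain size of 4 sums everything in one leaf and yields 1.0, whereas a grain size of 2 joins two partial sums and yields 0.0. Repeating either call reproduces its own result exactly.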

Definition at line 776 of file UT_ParallelUtil.h.

template<typename RANGE>
template<typename Range , typename Body >
void UTparallelFor ( const Range &  range,
const Body &  body,
const int  subscribe_ratio = 2,
const int  min_grain_size = 1,
const bool  force_use_task_scope = true 
)
friend

Run the body function over a range in parallel. UTparallelFor attempts to spread the range out over at most subscribe_ratio * num_processor tasks. The factor subscribe_ratio can be used to help balance the load. UTparallelFor() uses tbb for its implementation. The grain size used is the maximum of min_grain_size and UTestimatedNumItems(range) / (subscribe_ratio * num_processor). If subscribe_ratio == 0, then a grain size of min_grain_size will be used. A range can be split only when UTestimatedNumItems(range) exceeds the grain size and the range is divisible. Requirements for the Range functor are:

  • the requirements of the tbb Range Concept
  • UT_estimatorNumItems<Range> must return the estimated number of work items for the range. When Range::size() is not the correct estimate, a (partial) specialization of UT_estimatorNumItems must be provided for the type Range.
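
The grain size rule described for UTparallelFor can be written out as a small helper. This is only a sketch of the stated formula: effectiveGrainSize is a hypothetical name, not an HDK function, and the use of integer division and an explicit num_processors argument are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the documented rule:
//   grain = max(min_grain_size, estimated_items / (subscribe_ratio * num_processors))
// with grain = min_grain_size when subscribe_ratio == 0.
std::size_t effectiveGrainSize(std::size_t estimated_items,
                               int subscribe_ratio,
                               int min_grain_size,
                               int num_processors)
{
    assert(subscribe_ratio >= 0 && min_grain_size >= 1 && num_processors >= 1);
    const std::size_t min_grain = static_cast<std::size_t>(min_grain_size);
    if (subscribe_ratio == 0)
        return min_grain;
    // Upper bound on the number of tasks the range is spread over.
    const std::size_t max_tasks = static_cast<std::size_t>(subscribe_ratio)
                                * static_cast<std::size_t>(num_processors);
    // Integer division is an assumption of this sketch.
    return std::max(min_grain, estimated_items / max_tasks);
}
```

For example, one million items with the default subscribe_ratio of 2 on 8 processors gives a grain size of 62500, so the range is spread over at most 16 tasks. A small range of 100 items with min_grain_size set to 64 stays at 64.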

Requirements for the Body function are:

  • Body::Body(const Body &);

    Copy Constructor
  • Body::~Body();

    Destructor
  • void Body::operator()(const Range &range) const;
    Function call to perform operation on the range. Note the operator is const.

The requirements for a Range object are:

  • Range::Range(const Range&);

    Copy constructor
  • Range::~Range();

    Destructor
  • bool Range::is_divisible() const;

    True if the range can be partitioned into two sub-ranges
  • bool Range::empty() const;

    True if the range is empty
  • Range::Range(Range &r, UT_Split);

    Split the range r into two sub-ranges (i.e. modify r and *this)
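
As a rough illustration of the Range requirements listed above, the sketch below defines a small custom range and a sequential driver that uses only those members. RowRange, SplitTag (a stand-in for UT_Split) and visitLeaves are hypothetical names introduced here, and the driver runs on one thread, so it shows how the splitting constructor partitions work rather than how the HDK schedules it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for UT_Split.
struct SplitTag {};

// A half-open range of row indices [begin, end).
// The copy constructor and destructor are compiler-generated.
class RowRange {
public:
    RowRange(int begin, int end) : myBegin(begin), myEnd(end)
    {
        assert(begin <= end);
    }
    // Splitting constructor: modifies r (keeps the lower half)
    // and initializes *this with the upper half.
    RowRange(RowRange &r, SplitTag)
        : myBegin(r.myBegin + (r.myEnd - r.myBegin) / 2), myEnd(r.myEnd)
    {
        r.myEnd = myBegin;
    }
    bool empty() const { return myBegin == myEnd; }
    bool is_divisible() const { return myEnd - myBegin > 1; }
    int begin() const { return myBegin; }
    int end() const { return myEnd; }
private:
    int myBegin, myEnd;
};

// Sequential driver: split until the range is no longer divisible,
// then record a visit for every row in the leaf.
void visitLeaves(RowRange r, std::vector<int> &hits)
{
    if (r.is_divisible())
    {
        RowRange upper(r, SplitTag());
        visitLeaves(r, hits);
        visitLeaves(upper, hits);
        return;
    }
    for (int i = r.begin(); i != r.end(); ++i)
        ++hits[i];
}
```

Running visitLeaves over RowRange(0, 10) with a zero-filled vector of ten counters leaves every counter at exactly 1, which is the property a splitting constructor has to preserve: the two halves together cover the original range with no overlap.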

Example:

#include <UT/UT_ParallelUtil.h>

class Square {
public:
    Square(fpreal *data) : myData(data) {}
    ~Square() {}
    // Square every element of the sub-range in place.
    // The operator is const, as required of the Body.
    void operator()(const UT_BlockedRange<int64> &range) const
    {
        for (int64 i = range.begin(); i != range.end(); ++i)
            myData[i] *= myData[i];
    }
    fpreal *myData;
};
...
void
parallel_square(fpreal *array, int64 length)
{
    UTparallelFor(UT_BlockedRange<int64>(0, length), Square(array));
}
See Also
UTparallelReduce(), UT_BlockedRange()

Definition at line 292 of file UT_ParallelUtil.h.

template<typename RANGE>
template<typename Range , typename Body >
void UTparallelReduce ( const Range &  range,
Body &  body,
const int  subscribe_ratio = 2,
const int  min_grain_size = 1,
const bool  force_use_taskscope = true 
)
friend

UTparallelReduce() is a simple wrapper that uses tbb for its implementation. Run the body function over a range in parallel.

WARNING: The operator()() and join() functions MUST NOT initialize data! Both of these functions MUST ONLY accumulate data! This is because TBB may re-use body objects for multiple ranges. Effectively, operator()() must act as an in-place join operation for data as it comes in. Initialization must be kept to the constructors of Body.

Requirements for the Body function are:

  • Body::~Body();

    Destructor
  • Body::Body(Body &r, UT_Split);

    The splitting constructor. WARNING: This must be able to run concurrently with calls to r.operator()() and r.join(), so this should not copy values accumulating in r.
  • void Body::operator()(const Range &range);
    Function call to perform operation on the range. Note the operator is not const.
  • void Body::join(const Body &other);
    Join the results from another operation with this operation. Note the function is not const.
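
The warning about initialization can be demonstrated with a hand-written schedule. The sketch below does not use TBB: SplitTag stands in for UT_Split, operator() takes a plain index pair instead of a Range to keep the example short, and runSchedule is a hypothetical driver that re-uses one body for two sub-ranges, which is the situation the warning describes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for UT_Split.
struct SplitTag {};

// Correct body: constructors initialize, operator() and join() only accumulate.
struct SumBody {
    const std::vector<int> *myData;
    long mySum;
    explicit SumBody(const std::vector<int> &d) : myData(&d), mySum(0) {}
    SumBody(SumBody &src, SplitTag) : myData(src.myData), mySum(0) {}
    void operator()(std::size_t begin, std::size_t end)
    {
        for (std::size_t i = begin; i != end; ++i)
            mySum += (*myData)[i];
    }
    void join(const SumBody &other) { mySum += other.mySum; }
};

// Incorrect body: operator() resets the sum, discarding earlier sub-ranges.
struct ResettingSumBody {
    const std::vector<int> *myData;
    long mySum;
    explicit ResettingSumBody(const std::vector<int> &d) : myData(&d), mySum(0) {}
    ResettingSumBody(ResettingSumBody &src, SplitTag) : myData(src.myData), mySum(0) {}
    void operator()(std::size_t begin, std::size_t end)
    {
        mySum = 0;  // BUG: initialization inside operator()
        for (std::size_t i = begin; i != end; ++i)
            mySum += (*myData)[i];
    }
    void join(const ResettingSumBody &other) { mySum += other.mySum; }
};

// Simulated schedule over six items: body a handles [0,2) and later [2,4),
// a split-off body b handles [4,6), then the results are joined.
template <typename Body>
long runSchedule(const std::vector<int> &data)
{
    Body a(data);
    a(0, 2);
    Body b(a, SplitTag());
    b(4, 6);
    a(2, 4);  // a is re-used for a second sub-range
    a.join(b);
    return a.mySum;
}
```

For the data {1, 2, 3, 4, 5, 6}, the accumulating body returns the full sum of 21. The resetting body loses the contribution of the first sub-range and returns 18.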

The requirements for a Range object are:

  • Range::Range(const Range&);

    Copy constructor
  • Range::~Range();

    Destructor
  • bool Range::is_divisible() const;

    True if the range can be partitioned into two sub-ranges
  • bool Range::empty() const;

    True if the range is empty
  • Range::Range(Range &r, UT_Split);

    Split the range r into two sub-ranges (i.e. modify r and *this)

Example:

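class Dot {
public:
    Dot(const fpreal *a, const fpreal *b)
        : myA(a)
        , myB(b)
        , mySum(0)
    {}
    // Splitting constructor: shares the inputs but starts from a zero sum
    // instead of copying the value accumulating in src.
    Dot(Dot &src, UT_Split)
        : myA(src.myA)
        , myB(src.myB)
        , mySum(0)
    {}
    // Accumulate only; the body may be re-used for multiple ranges.
    void operator()(const UT_BlockedRange<int64> &range)
    {
        for (int64 i = range.begin(); i != range.end(); ++i)
            mySum += myA[i] * myB[i];
    }
    void join(const Dot &other)
    {
        mySum += other.mySum;
    }
    const fpreal *myA, *myB;
    fpreal mySum;
};
fpreal
parallel_dot(const fpreal *a, const fpreal *b, int64 length)
{
    Dot body(a, b);
    UTparallelReduce(UT_BlockedRange<int64>(0, length), body);
    return body.mySum;
}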
See Also
UTparallelFor(), UT_BlockedRange()

Definition at line 716 of file UT_ParallelUtil.h.


The documentation for this class was generated from the following file: