These functions allows arbitrary looping operations to be performed in parallel, using a versatile multi-threading framework. It works hand-in-hand with the single-threaded looping functions, and can be used to code up complex operations with relatively little effort.
The ThreadedLoop() function is generally used by first initialising an object that will determine the order of traversal, which axes will be looped over between synchronisation calls, and what message to display in the progress bar if one is needed. This is all performed in the various overloaded versions of ThreadedLoop(), which are template functions allowing the class to be initialised from any HeaderType
class. These are described in more detail below.
The object returned by the ThreadedLoop() functions provide methods to loop over ImageType classes. To do this, the object returned by ThreadedLoop() keeps track of which axes will be managed by each thread independently (the inner axes), and which will be coordinated across threads (the outer axes). By default, the inner axes consist of a single axis, chosen to be the axis of smallest stride in the source HeaderType
class provided at initialisation. The remaining axes are coordinated across threads: each invocation of the thread's functor is given a fresh position to operate from, in the form of an Iterator class.
To help illustrate the concept, consider an example whereby the inner axes have been set to the x & y axes (i.e. axes 0 & 1), and the outer axes have been set to the z and volume axes (i.e. axes 2 & 3). Each thread will do the following:
ImageType
classes to be processed according to these coordinates;The ThreadedLoop() functions can be used to set up any reasonable multi-threaded looping structure. The various relevant aspects that can be influenced include:
HeaderType
class will be included.The various overloaded functions allow these settings to be specified in different ways. A HeaderType
class is always required; it is generally recommended to provide the (or one of the) input ImageType
class that is to be processed, since its order of traversal will have the most influence on performance (by making the most efficient use of the hardware's RAM access and CPU cache). It is also possible to supply a vector<size_t>
directly if required.
These axes can be restricted to a specific range to allow volume-wise processing, etc. The inner axes can be specified by supplying how many are needed; they will then be taken from the list of axes to be looped over, in order of increasing stride. It is also possible to provide the list of inners axes and outer axes as separate vector<size_t>
arguments.
The run() methods will run the ThreadedLoop, invoking the specified function or functor once per voxel in the loop. The different versions will expect different signatures for the function or functor, and manage looping over different numbers of ImageType
classes. This should be clarified by examining the examples below.
This example can be used to compute the exponential of the voxel values in-place, while displaying a progress message:
To make this operation more easily generalisable to any ImageType
, use a functor with a template operator() method:
Note that a simple operation such as the previous example can be written more compactly using C++11 lambda expressions:
As a further example, the following snippet performs the addition of any ImageTypes
vox1 and vox2, this time storing the results in vox_out, with no progress display, and looping according to the strides in vox1:
Note that the ImageType
arguments to run() must be provided in the same order as expected by the Functor's operator()
method.
Again, such a simple operation can be written more compactly using C++11 lambda expressions:
This example uses a functor to computes the root-mean-square of vox:
The run_outer() method can be used if needed to loop over the indices in the outer loop only. This method is used internally by the run() methods, and may prove useful in selected cases if each thread needs to handle its own looping over the inner axes. Usage is essentially identical to the run() method, and the function or functor provided will need to have a void operator() (const Iterator& pos) method defined. Note that in this case, any ImageType
classes to be looped over will need to be stored as members to the functor, since it will only receive an Iterator
for each invocation - the functor will need to then implement looping over the inner axes from the position provided in the Iterator
.