Generic algorithms for 2D spatial fields II: Data type

In the previous post we had a quick look at generic algorithms, spatial fields and the requirements of generic algorithms for 2D spatial fields. In this post we are going to tackle the first requirement:

It must be possible for the user to pick the data type and value type of the input and output arguments. The data type does not need to be picked from a fixed set of types. The user must be able to use her own types.

This requirement is the easiest to fulfill. By turning our previous function into a function template, we can parameterize the function by the data types of the arguments and the result.

#pragma once
#include <cmath>


template<
    typename Base,
    typename Exponent,
    typename Result>
inline void pow(
    Base const& base,
    Exponent const& exponent,
    Result& result)
{
    size_t const nr_rows{base.shape()[0]};
    size_t const nr_cols{base.shape()[1]};

    for(size_t r = 0; r < nr_rows; ++r) {
        for(size_t c = 0; c < nr_cols; ++c) {
            result[r] = std::pow(base[r], exponent);
        }
    }
}

This algorithm can be called like this:

#include <cstddef>
#include "fern/feature/core/raster.h"
#include "pow.h"


int main(
    int /* argc */,
    char** /* argv */)
{
    using Raster = fern::Raster<double, 2>;

    size_t const nr_rows{2};
    size_t const nr_cols{3};
    double west{0.0};
    double north{0.0};
    double cell_size{10.0};
    Raster::Transformation transformation{{west, cell_size, north, cell_size}};
    Raster base(fern::extents[nr_rows][nr_cols], transformation);

    // Assign values to the raster.
    // ...

    double exponent{2};

    Raster result(fern::extents[nr_rows][nr_cols], transformation);

    pow(base, exponent, result);

    return EXIT_SUCCESS;
}

Some observations:

  • The result is now passed as a writable argument to the function. The compiler instantiates a function for each unique set of template arguments that we provide. By turning the result into an argument, these instantiations are involved in overload resolution allowing the compiler to pick the best match. This is relevant if we call the algorithm for the same Base and Exponent types, but different Result types. If we had returned the result the compiler would have thrown an error complaining about ambiguous overloads. It cannot choose between two functions with the same argument types, but different return types.
  • Although we turned the argument and result data types into template arguments, we are still limiting our users to the fern::Raster interface. Only raster types adhering to this interface can be passed. Ideally, we want to decouple the algorithm from a specific data type’s interface, allowing the user to use any raster-like type, whatever its interface.
  • Although Base is a raster-like type, Exponent must still be a floating point type. We would like to also be able to pass a raster-like type for the exponent argument.

Let’s tackle these limitations in turn. A technique to decouple an algorithm from its argument’s interfaces is to use customization points. And we can use a technique called tag dispatching to select an algorithm’s implementation based on properties of the template’s argument types.

Customization points

Let us forget about a single specific class for representing rasters and focus on the minimal set of requirements that the pow algorithm poses on the arguments passed in. Given that the algorithm currently takes a raster as its first argument and a number as its second, we can define the following requirements:

  • It must be possible to determine the number of rows and columns in the base argument raster.
  • It must be possible to iterate over all raster cells and obtain a readable value of each cell from the base argument raster and a writable reference of each cell in the result raster.

Customization points are free function templates that return a property of an argument and which can be overloaded for different argument types. One example from the C++ Standard Library is std::begin, which returns an iterator to the first element in a collection. This function is overloaded for the various containers in the STL and for C-style arrays. Using std::begin to obtain such an iterator is a better option than calling my_container.begin() because the latter assumes my_container‘s type provides this function. C-style arrays don’t, for example.

So, customization points help to write code that does not depend on a specific type’s interface. Here is a declaration of a customization point for getting the size of a raster:

template<
    typename Raster>
size_t size(Raster const& raster, size_t dimension);

It takes a raster and an id of a dimension. In the case of 2D rasters, this latter argument must be a 0 or a 1, where 0 represents the first dimension in a 2D array and 1 the second. Since C-style arrays have a row-major ordering, the first dimension iterates over the rows and the second over the columns.

The Raster template argument has no relation with the fern::Raster class template used earlier. We can implement the customization point for our fern::Raster class template as follows, using a partial specialization:

template<
    typename T,
    size_t nr_dimensions>
size_t size(
   fern::Raster<T, nr_dimensions> const& raster,
   size_t dimension)
{
   return raster.shape()[dimension];
}

Within the customization point, we can use our type specific interface. For raster types with other interfaces we can implement alternative overloads. In our algorithm we can now replace the code obtaining the size of the dimensions to be calls to our size customization point.

We need two more customization points: one for obtaining a readable value of a specific cell and one for obtaining a writable reference to a value of a specific cell. Let’s call them both get:

template<
    typename Raster>
value_type<Raster> const& get(Raster const& raster, size_t row, size_t col);

template<
    typename Raster>
value_type<Raster>& get(Raster& raster, size_t row, size_t col);

The return type is calculated by the value_type alias template, which returns the value type of the template argument. Its definition assumes that type traits (see below) provide the value type of every data type:

template<
    typename T>
using value_type = typename TypeTraits<T>::value_type;

For our fern::Raster class template we can implement the additional customization points as follows:

template<
    typename T,
    size_t nr_dimensions>
value_type<fern::Raster<T, nr_dimensions>> const& get(
    fern::Raster<T, nr_dimensions> const& raster,
    size_t row,
    size_t col)
{
    return raster[row][col];
}

template<
    typename T,
    size_t nr_dimensions>
value_type<fern::Raster<T, nr_dimensions>>& get(
    fern::Raster<T, nr_dimensions>& raster,
    size_t row,
    size_t col)
{
    return raster[row][col];
}

The last missing piece is the implementation of the TypeTraits class template. Type traits provide properties of a type. In our case, we need the value type of the Raster class template.

template<
    typename T>
struct TypeTraits
{
};
template<
    typename T,
    size_t nr_dimensions>
struct TypeTraits<
    fern::Raster<T, nr_dimensions>>
{
    using value_type = T;
};

Let’s now rewrite our algorithm in terms of customization points:

#pragma once
#include <cmath>
#include "customization_point.h"


template<
    typename Base,
    typename Exponent,
    typename Result>
inline void pow(
    Base const& base,
    Exponent const& exponent,
    Result& result)
{
    size_t const nr_rows{size(base, 0)};
    size_t const nr_cols{size(base, 1)};

    for(size_t r = 0; r < nr_rows; ++r) {
        for(size_t c = 0; c < nr_cols; ++c) {
            get(result, r, c) = std::pow(get(base, r, c), exponent);
        }
    }
}

Note that the algorithm is written in terms of template parameters and customization points, all to be provided by the caller. We abstracted away any reference to a specific raster type. With this in place, we can now call this algorithm with every raster-like base argument for which we have implemented the three customization points. In the next section we are going to get rid of the limitation that the base argument must be a raster and the exponent a number.

Tag dispatching

If we call our current pow algorithm with a raster instance as the exponent argument, the compilation fails. Our algorithm expects the exponent to be a single number. Passing a raster requires a different implementation. The problem we need to solve is how we can pass different permutations of raster and number arguments to pow, without having to revert to differently named functions like pow_raster_number and pow_raster_raster. Somehow, our pow implementation needs to do something different depending on its template argument types.

Tag dispatching can help us here. What we need is a tag for each data type we support and call a second function that actually implements the algorithm for each unique combination of argument types. Let us first define the tags:

struct number_tag {};
struct raster_tag {};

Now we need a way to obtain the tag given a data type. We can extent our traits class for that:

template<
    typename T>
struct TypeTraits
{
    using tag = number_tag;  // Default.
};


template<
    typename T>
using tag = typename TypeTraits<T>::tag;
template<
    typename T,
    size_t nr_dimensions>
struct TypeTraits<
    fern::Raster<T, nr_dimensions>>
{
    using value_type = T;  // Like before.
    using tag = raster_tag;
};

In our original pow implementation we can now dispatch on data type tag to an algorithm with the correct implementation. Note that this all happens during the compilation, and not at runtime.

#pragma once
#include <cmath>
#include "customization_point.h"


namespace detail {

    template<
        typename Base,
        typename Exponent,
        typename Result>
    inline void pow(
        Base const& base,
        Exponent const& exponent,
        Result& result,
        raster_tag,
        number_tag)
    {
        size_t const nr_rows{size(base, 0)};
        size_t const nr_cols{size(base, 1)};

        for(size_t r = 0; r < nr_rows; ++r) {
            for(size_t c = 0; c < nr_cols; ++c) {
                get(result, r, c) = std::pow(get(base, r, c), exponent);
            }
        }
    }

    template<
        typename Base,
        typename Exponent,
        typename Result>
    inline void pow(
        Base const& base,
        Exponent const& exponent,
        Result& result,
        raster_tag,
        raster_tag)
    {
        size_t const nr_rows{size(base, 0)};
        size_t const nr_cols{size(base, 1)};

        for(size_t r = 0; r < nr_rows; ++r) {
            for(size_t c = 0; c < nr_cols; ++c) {
                get(result, r, c) = std::pow(get(base, r, c),
                    get(exponent, r, c));
            }
        }
    }

}  // namespace detail


template<
    typename Base,
    typename Exponent,
    typename Result>
inline void pow(
    Base const& base,
    Exponent const& exponent,
    Result& result)
{
    using base_tag = tag<Base>;
    using exponent_tag = tag<Exponent>;

    detail::pow(base, exponent, result, base_tag{}, exponent_tag{});
}

More overloads of detail::pow can be implemented if necessary.

With these changes we made it possible for the user to use our algorithm with her own data types. In the next post we are going to take a look at handling invalid data elements.

Leave a Reply

Your email address will not be published. Required fields are marked *