Compile time generated function dispatcher with minimal overhead

1.3k views Asked by At

I'm trying to implement a fast function dispatcher using compile time generated arrays to being able to use it at runtime in O(1).

Some lines of code just to clarify:

template<int i>
void f()
  {
  // do stuff 
  }

// specialized for every managed integer 
template<>
void f<1>
{
// do stuff
}

Dispatcher<1,5,100,300> dispatcher;  
dispatcher.execute(5); // this should call f<5>()

Let's call N the number of inputs of the dispatcher (4 in this case) and M the maximum value of the dispatcher input (300) in this case.

I've been able to make it work creating an array with size equal to M. This exploits the fact that at runtime you can do something like:

dispatcher.execute(5) -> internalArray[5]();

This works of course, but it is not feasible for arrays of big dimensions.

The best thing would be to generate an array of N elements only, and do some math trick to transform the input index into the index of the second array.

In the example, something that translates 1,5,100,300 respectively into 0,1,2,3. I have been able to do a kind of pre-processing method to transform them, but I'm looking for a way to avoid this step.

In other words I think I'm looking for some kind of minimal perfect hashing that can be used at compile time for my specific case in a very efficient way (ideally without any overhead, something like: goto: MyInstruction).

I'm not looking for alternatives that use virtual functions, std::map or complex operations.

Please ask if there is something is not clear.

PS I'm using C++11 but any idea is welcome

[Edit] I'm aware of the labels as values language extension of GCC. With those I would be maybe able to achieve my goal, but need a portable solution.

5

There are 5 answers

9
bolov On

Well, I don't know if you are going to be able to do what you want. Make a code that creates a perfect hash function for any input seems to me pretty ... not doable.

Anyway, here is a simple solution to write the code. It's C++17 but can be made to work with C++11 with a bit of trickery.

template<int i> void f();

template <int... Is>
struct Dispatcher
{
    template <int I> constexpr auto execute_if(int i)
    {
        if  (I == i)
            f<I>();
    }

    constexpr auto execute(int i)
    {
        (execute_if<Is>(i), ...);
    }
};

auto test()
{
    Dispatcher<1,5,100,300> dispatcher;  
    dispatcher.execute(5);
}

The above code translates to just a simple jump because 5 is a compile time constant:

test():                               # @test()
        jmp     void f<5>()            # TAILCALL

If the argument is a runtime variable, then it does a series of compares:

auto test(int i)
{
    Dispatcher<1,5,100,300> dispatcher;  
    dispatcher.execute(i);
}
test(int):                               # @test(int)
        cmp     edi, 99
        jg      .LBB0_4
        cmp     edi, 1
        je      .LBB0_7
        cmp     edi, 5
        jne     .LBB0_9
        jmp     void f<5>()            # TAILCALL
.LBB0_4:
        cmp     edi, 100
        je      .LBB0_8
        cmp     edi, 300
        jne     .LBB0_9
        jmp     void f<300>()          # TAILCALL
.LBB0_9:
        ret
.LBB0_7:
        jmp     void f<1>()            # TAILCALL
.LBB0_8:
        jmp     void f<100>()          # TAILCALL

The solution can be improved to perform a binary search, but it is not trivial.

8
Paul Sanders On

Building on @bolov's answer, it's possible to use an arbitrary dispatch algorithm when i is not a constant by changing:

constexpr auto execute(int i)
{
    (execute_if<Is>(i), ...);
}

To:

constexpr auto execute(unsigned i)
{
    (execute_if<Is>(i), ...);
}

And then adding:

constexpr auto execute (int& i)
{
    // Add arbitrary dispatch mechanism here
}

Complete example, C++11 compatible and using a rather clunky std::map (worst case complexity log n) when i is not a constant (I dropped the constexpr stuff to make life easy in C++11):

#include <map>
#include <iostream>

std::map <int, void (*) ()> map;

template <int i> void f ();
template <> void f <1> () { std::cout << "f1\n"; }
template <> void f <2> () { std::cout << "f2\n"; }
template <> void f <3> () { std::cout << "f3\n"; }
template <> void f <4> () { std::cout << "f4\n"; }
template <> void f <5> () { std::cout << "f5\n"; }

template <int ... Is>
struct Dispatcher
{
    template <int first> void execute_if (int i)
    {
        if (first == i)
        {            
            std::cout << "Execute f" << i << " via template\n";
            f <first> ();
        }
    }

    template <int first, int second, int... rest> void execute_if (int i)
    {
        if (first == i)
        {            
            std::cout << "Execute f" << i << " via template\n";
            f <first> ();
        }
        else
            execute_if <second, rest...> (i);
    }

    void execute (unsigned i)
    {
        execute_if <Is...> (i);
    }

    void execute (int& i)
    {
        std::cout << "Execute f" << i << " via map\n";
        map.at (i) ();
    }
};

int main()
{
    map [1] = f <1>;
    map [2] = f <2>;
    map [3] = f <3>;
    map [4] = f <4>;
    map [5] = f <5>;

    Dispatcher <1, 2, 4> dispatcher;  
    dispatcher.execute (2);
    int i = 4;
    dispatcher.execute (i);
}

Output:

Execute f2 via template
f2
Execute f4 via map
f4

Live Demo


Edit: as per the OP's request, here's a version using a binary search instead of std::map. The key to this is building the array to be searched in the Dispatcher constructor.

#include <vector>
#include <iostream>

template <int i> void f ();
template <> void f <1> () { std::cout << "f1\n"; }
template <> void f <2> () { std::cout << "f2\n"; }
template <> void f <3> () { std::cout << "f3\n"; }
template <> void f <4> () { std::cout << "f4\n"; }
template <> void f <5> () { std::cout << "f5\n"; }

using ve = std::pair <int, void (*) ()>;

template <int ... Is>
struct Dispatcher
{
    template <int first> void execute_if (int i)
    {
        if (first == i)
        {            
            std::cout << "Execute f" << i << " via template\n";
            f <first> ();
        }
    }

    template <int first, int second, int... rest> void execute_if (int i)
    {
        if (first == i)
        {            
            std::cout << "Execute f" << i << " via template\n";
            f <first> ();
        }
        else
            execute_if <second, rest...> (i);
    }

    void execute (unsigned i)
    {
        execute_if <Is...> (i);
    }

    void execute (int& i)
    {
        std::cout << "Execute f" << i << " via binary search\n";
        auto lb = lower_bound (indexes.begin (), indexes.end (), ve (i, nullptr), 
            [] (ve p1, ve p2) { return p1.first < p2.first; });    
        if (lb != indexes.end () && lb->first == i)
            lb->second ();
    }

    template <int first> void append_index ()
    {
        indexes.emplace_back (ve (first, f <first>));
    }

    template <int first, int second, int... rest> void append_index ()
    {
        append_index <first> ();
        append_index <second, rest...> ();
    }

    Dispatcher ()
    {
        append_index <Is...> ();
    }

private:
    std::vector <ve> indexes;
};

int main()
{
    Dispatcher <1, 2, 4> dispatcher;  
    dispatcher.execute (2);
    int i = 4;
    dispatcher.execute (i);
}

Live demo

4
max66 On

The OP asked for a C++11 solution, that maintain the constexpres-ness, following the bolov's solution example.

Well... I don't if it's a good idea because a constexpr function/member in C++11 need to be recursive and return a value. And compilers pose strict limits to template recursion and this can be a problem if N (the sizeof...(Is)) is high.

Anyway... the best I can imagine is the following

template <int... Is>
struct Dispatcher
{
    template <typename = void>
    constexpr int execute_h (int) const
     { /* wrong case; exception? */ return -1; }

    template <int J0, int ... Js>
    constexpr int execute_h (int i) const
     { return J0 == i ? (f<J0>(), 0) : execute_h<Js...>(i); }

    constexpr int execute (int i) const
     { return execute_h<Is...>(i); }
};

that can be used, computing f<>() compile-time, as follows

void test()
{
    constexpr Dispatcher<1,5,100,300> dispatcher;  
    constexpr auto val1 = dispatcher.execute(5);
    constexpr auto val2 = dispatcher.execute(6);

    std::cout << val1 << std::endl; // print 0 (5 is in the list)
    std::cout << val2 << std::endl; // print -1 (6 isn't in the list)
}

also f<>() must be constexpr and, in C++11, can't return void; I've used the following

template <int i>
constexpr int f ()
 { return i; }
2
max66 On

A little improvement (IMHO) for bolov's solution

Writing execute_if to return true, when f<I>() is executed, or false, otherwise

template <int I>
constexpr auto execute_if (int i) const
{ return I == i ? f<I>(), true : false; }

instead of using the comma operator for template folding

template <int ... Is>
constexpr auto execute(int i) const
 { (execute_if<Is>(i), ...); }

we can use the or (||) operator

template <int ... Is>
constexpr auto execute(int i) const
 { (execute_if<Is>(i) || ...); }

Using the comma operator, execute_is<Is>(i) is called for ever Is, also when the first Is is equal to i; using the || we have short circuiting, that is that execute_is<Is>(i) is called only until we get a Is that is equal to i.

0
Thomas Mueller On

In theory(!) you could create a perfect hash function using C++ templating.

  • This question has code on how to create a perfect hash function (using brute force, so it only works for relatively small sets).
  • This question shows that C++ templating is Turing complete, so it should be possible to convert the above code to C++ templates.
  • It might even be possible with C preprocessor, as it doesn't need an endless loop.

But I assume doing it is rather hard.