How to serialize complex data structures as coded constexpr initializations in C++20

308 views Asked by At

The goal is to serialize large data structures as constexpr initializations, so they will be part of the .text or .rodata segment and can be used for consteval, constexpr etc.

The data structures contain containers for cross-referencing/indexing etc. All this code will be generated, so we can compute container state inside the code generator using the same or equivalent container logic, then serialize the container state as is.

C++ does not support non-transient dynamic allocation for constexpr.

"In Kona 2019, this approach was considered too brittle, and as a result non-transient allocation was removed from the feature set." http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p0784r7.html

Therefore, the use of std:: or self-made dynamic containers is not possible.

The use of (variadic?) templates seems out of the question too, as the data structures must be usable in a runtime-dynamic manner i.e. any use of templates would require a type erasure layer on top, which seems unwieldy. See https://www.artima.com/articles/on-the-tension-between-object-oriented-and-generic-programming-in-c

The next attempt, as explored here, is trying to serialize as compound literals. See https://en.cppreference.com/w/c/language/compound_literal

Standard C++20 doesn't actually support compound literals officially, but we can rely on compiler extensions to supports this C feature inside C++. Seems to be available in all major C++ compilers.

The example below is just a very small proof of concept (or attempt at such).

Problem: The following works on Clang but not on GCC. Why? Is there a different way?

#include <memory>

template <typename K, typename V>
struct Entry{
    K key;
    V value;
};

template <typename K, typename V>
struct Bucket{
    size_t n;
    // C++ cannot initialize a flexible array in a struct, use a pointer.
    Entry<K, V> *entries; 
};

template <typename K, typename V>
struct HashMap{
    using bucket_t = Bucket<K, V>;
    using entry_t = Entry<K, V>;
    size_t n;
    // C++ cannot initialize a flexible array in a struct, use a pointer.
    bucket_t *buckets; 

    constexpr V get(K key) const {
        const size_t i = key % n;
        const auto &bucket = buckets[i];
        for (size_t j = 0; j < bucket.n; ++j) {
            if (bucket.entries[j].key == key) {
                return bucket.entries[j].value;
            }
        }
        return {};
    }
};

using MyHashMap = HashMap<int, double>;
constexpr MyHashMap map =
{
        2,
        // Because we can't initialize a flexible array, we need to employ a so-called 
        // compound literal and let it decay to a pointer. 
        (MyHashMap::bucket_t[]) 
        {
                { 
                    3, 
                    (MyHashMap::entry_t[]) // dito
                    { {0, 1.2}, {2, 2.4}, {6, 3.6} } 
                },
                { 
                    4, 
                    (MyHashMap::entry_t[]) // dito
                    { {1, 0.0}, {5, 2.0}, {11, 4.0}, {13, 1.0} } 
                }
        }
};

// Serves as proof of concept by succeeding as consteval.
consteval int proofOfConcept() {
    // Should return 76 as a constant, 
    return 10*(map.get(0) + map.get(11) + map.get(2));
}

int main() {
    return proofOfConcept();
}

On GCC this fails.

<source>: In function 'int main()':
<source>:94:26:   in 'constexpr' expansion of 'proofOfConcept()'
<source>:90:23:   in 'constexpr' expansion of 'map.HashMap<int, double>::get(0)'
<source>:94:26: error: the value of '._anon_115' is not usable in a constant expression
   94 |     return proofOfConcept();
      |            ~~~~~~~~~~~~~~^~
<source>:84:9: note: '._anon_115' was not declared 'constexpr'
   84 |         }
      |         ^

Explore it here:

https://godbolt.org/z/nMKnoY6fY

1

There are 1 answers

4
markus On

As 康桓瑋 pointed out, it's a lifetime issue. It seems each nested array literal needs to be defined as a symbol separately and then used in its place. This avoids the non-standard compound literal usage too.

But it adds the problem of transitive constness. We need to be able to parametrize the containing structures with constness.

So this seems to be a solution:

#include <memory>

template<bool isConst, typename T>
struct TransitiveConst { using Result = T; };

template<typename T>
struct TransitiveConst<true, T> { using Result = const T; };

template <typename K, typename V>
struct Entry{
    K key;
    V value;
};

template <typename E>
struct Bucket{
    size_t n;
    // C++ cannot initialize a flexible array in a struct, use a pointer.
    E *entries;
};

template <bool isConst, typename K, typename V>
struct HashMap{
    using entry_t = typename TransitiveConst<isConst, Entry<K, V>>::Result;
    using bucket_t = typename TransitiveConst<isConst, Bucket<entry_t>>::Result;
    size_t n;
    // C++ cannot initialize a flexible array in a struct, use a pointer.
    bucket_t *buckets;

    constexpr V get(K key) const {
        const size_t i = key % n;
        const auto &bucket = buckets[i];
        for (size_t j = 0; j < bucket.n; ++j) {
            if (bucket.entries[j].key == key) {
                return bucket.entries[j].value;
            }
        }
        return {};
    }
};

using MyHashMap = HashMap<true, const int, const double>;

constexpr MyHashMap::entry_t entry0[] = { {0, 1.2}, {2, 2.4}, {6, 3.6} };
constexpr MyHashMap::entry_t entry1[] = { {1, 0.0}, {5, 2.0}, {11, 4.0}, {13, 1.0} };
constexpr MyHashMap::bucket_t buckets[] = {
        {
            std::size(entry0),
            entry0
        },
        {
            std::size(entry1),
            entry1
        }
};

constexpr MyHashMap map =
{
        std::size(buckets),
        buckets
};

// Serves as proof of concept by succeeding as consteval.
consteval int proofOfConcept() {
    // Should return 76 as a constant,
    return (10*(map.get(0) + map.get(11) + map.get(2)));
}

int main() {
    return proofOfConcept();
}

https://godbolt.org/z/fbfTdGxE7