Implementing equivalence relations in C++ (using boost::disjoint_sets)

3.9k views Asked by At

Assume you have many elements, and you need to keep track of the equivalence relations between them. If element A is equivalent to element B, it is equivalent to all the other elements B is equivalent to.

I am looking for an efficient data structure to encode this information. It should be possible to dynamically add new elements through an equivalence with an existing element, and from that information it should be possible to efficiently compute all the elements the new element is equivalent to.

For example, consider the following equivalence sets of the elements [0,1,2,3,4]:

0 = 1 = 2
3 = 4

where the equality sign denotes equivalence. Now we add a new element 5

0 = 1 = 2
3 = 4 
5

and enforcing the equivalence 5=3, the data structure becomes

0 = 1 = 2
3 = 4 = 5

From this, one should be able to iterate efficiently through the equivalence set for any element. For 5, this set would be [3,4,5].

Boost already provides a convenient data structure called disjoint_sets that seems to meet most of my requirements. Consider this simple program that illustates how to implement the above example:

#include <cstdio>
#include <vector>
#include <boost/pending/disjoint_sets.hpp>
#include <boost/unordered/unordered_set.hpp>

/*
    Equivalence relations
    0 = 1 = 2
    3 = 4
 */

int main(int , char* [])
{
    typedef std::vector<int> VecInt;
    typedef boost::unordered_set<int> SetInt;

    VecInt rank (100);
    VecInt parent (100);
    boost::disjoint_sets<int*,int*> ds(&rank[0], &parent[0]);
    SetInt elements;
    for (int i=0; i<5; ++i) {
        ds.make_set(i);
        elements.insert(i);
    }

    ds.union_set(0,1);
    ds.union_set(1,2);
    ds.union_set(3,4);

    printf("Number of sets:\n\t%d\n", (int)ds.count_sets(elements.begin(), elements.end()));

    // normalize set so that parent is always the smallest number
    ds.normalize_sets(elements.begin(), elements.end());
    for (SetInt::const_iterator i = elements.begin(); i != elements.end(); ++i) {
        printf("%d %d\n", *i, ds.find_set(*i));
    }

    return 0;
}

As seen above one can efficiently add elements, and dynamically expand the disjoint sets. How can one efficiently iterate over the elements of a single disjoint set, without having to iterate over all the elements?

2

There are 2 answers

0
pts On

Most probably you can't do that, disjoint_sets doesn't support iteration over one set only. The underlying data structure and algorithms wouldn't be able to do it efficiently anyway, i.e. even if there was support built in to disjoint_sets for iteration over one set only, that would be just as slow as iterating over all sets, and filtering out wrong sets.

0
violet313 On

Either I am missing something, you forgot to mention something, or maybe you were overthinking this ;)

Happily, equivalence is not equality. For A & B to be equivalent; they only need to share an attribute with the same value. this could be a scalar or even a vector. Anyway, I think your posted requirements can be achieved just using std::multiset and it's std::multiset::equal_range() member function.

//////////////////////////////////////
class E
{
    //could be a GUID or something instead but the time complexity of
    //std::multiset::equal_range with a simple int comparison should be logarithmic
    static size_t _faucet;

public:

    struct LessThan
    {
        bool operator()(const E* l, const E* r) const { return (l->eqValue() < r->eqValue()); }
    };

    using EL=std::vector<const E*>;
    using ES=std::multiset<const E*, E::LessThan>;
    using ER=std::pair<ES::iterator, ES::iterator>;


    static size_t NewValue() { return ++_faucet; }

    ~E() { eqRemove(); }
    E(size_t val) : _eqValue(val) {}
    E(std::string name) : Name(name), _eqValue(NewValue()) { E::Elementals.insert(this); }


    //not rly a great idea to use operator=() for this. demo only..
    const E& operator=(const class E& other) { eqValue(other);  return *this; }

    //overriddable default equivalence interface
    virtual size_t eqValue() const  { return _eqValue; };

    //clearly it matters how mutable you need your equivalence relationships to be,,
    //in this implementation, if an element's equivalence relation changes then
    //the element is going to be erased and re-inserted.
    virtual void   eqValue(const class E& other)
    {
        if (_eqValue == other._eqValue) return;

        eqRemove();
        _eqValue=other._eqValue;
        E::Elementals.insert(this);
    };

    ES::iterator eqRemove() 
    {
        auto range=E::Elementals.equal_range(this);
        //worst-case complexity should be aprox linear over the range
        for (auto it=range.first; it!=range.second; it++)
            if (this == (*it))
                return E::Elementals.erase(it);            
        return E::Elementals.end();
    }

    std::string Name; //some other attribute unique to the instance
    static ES   Elementals; //canonical set of elements with equivalence relations

protected:
    size_t _eqValue=0;

};

size_t E::_faucet=0;
E::ES  E::Elementals{};


//////////////////////////////////////
//random specialisation providing
//dynamic class-level equivalence
class StarFish : public E
{

public:

    static void EqAssign(const class E& other)
    {
        if (StarFish::_id == other.eqValue()) return;

        E el(StarFish::_id);
        auto range=E::Elementals.equal_range(&el);

        StarFish::_id=other.eqValue();

        E::EL insertList(range.first, range.second);
        E::Elementals.erase(range.first, range.second);
        E::Elementals.insert(insertList.begin(), insertList.end());
    }


    StarFish() : E("starfish") {}

    //base-class overrides
    virtual size_t eqValue() const { return StarFish::_id; };

protected: //equivalence is a the class level
    virtual void eqValue(const class E& other) { assert(0); }

private:
    static size_t _id;
};

size_t StarFish::_id=E::NewValue();


//////////////////////////////////////
void eqPrint(const E& e)
{
    std::cout << std::endl << "elementals equivalent to " << e.Name << ": ";
    auto range=E::Elementals.equal_range(&e);
    for (auto it=range.first; it!=range.second; it++)
        std::cout << (*it)->Name << " ";
    std::cout << std::endl << std::endl;
}

//////////////////////////////////////
void eqPrint()
{
    for (auto it=E::Elementals.begin(); it!=E::Elementals.end(); it++)
        std::cout << (*it)->Name << ": " << (*it)->eqValue() << "  ";
    std::cout << std::endl << std::endl;
}

//////////////////////////////////////
int main()
{
    E e0{"zero"}, e1{"one"}, e2{"two"}, e3{"three"}, e4{"four"}, e5{"five"};

    //per the OP
    e0=e1=e2;
    e3=e4;
    e5=e3;

    eqPrint(e0);
    eqPrint(e3);
    eqPrint(e5);
    eqPrint();

    StarFish::EqAssign(e3);
    StarFish starfish1, starfish2;
    starfish1.Name="fred";

    eqPrint(e3);

    //re-assignment
    StarFish::EqAssign(e0);
    e3=e0;

    {   //out of scope removal
        E e6{"six"};
        e6=e4;
        eqPrint(e4);
    }

    eqPrint(e5);
    eqPrint(e0);
    eqPrint();

    return 0;
}

online demo

NB: C++ class inheritance also provides another kind of immutable equivalence that can be quite useful ;)