I have two OrderedSets and I'm trying to check whether one in a subset of the other - both the elements and their order is important. However, the orderedset package is giving me strange results.
>>> import orderedset
>>> a = orderedset.OrderedSet([433, 316, 259])
>>> b = orderedset.OrderedSet([433, 316, 69])
>>> a.issuperset(b)
True
This doesn't make any sense to me because b
contains a value (69
) that is definitely not in a
. Why is a
a superset
of b
then?
However, when I try this:
>>> c = orderedset.OrderedSet([1, 2, 3])
>>> d = orderedset.OrderedSet([1, 2, 4])
>>> c.issuperset(d)
False
This behaviour seems inconsistent to me: why should the choice of values - [433, 316, 259]
versus [1, 2, 3]
- in the OrderedSet affect the output of issuperset()
?
Perhaps there's a better way to do this? I need to know whether the elements in b
are contained in a
in the same order. Meaning that, if
a = OrderedSet([433, 316, 259])
I'm looking for partial matches within that set that start with the same starting value as a
(433
). This is what I want out of it:
OrderedSet([433, 316, 259])
OrderedSet([433, 316]])
OrderedSet([433])
And not:
OrderedSet([433, 259])
OrderedSet([316, 259])
OrderedSet([433, 316, 69])
OrderedSet([433, 259, 316])
OrderedSet([259, 433])
...
Basically, if this really confusing - I have an ordered set and I'm trying to find partial matches both in terms of the values and their order.
Presumably you are using this third party module since Python doesn't have a built-in orderedset.
A quick look through the source code on github shows that the
issuperset
function is implemented asTaking a look at how the less than or equal operator is defined for orderedset:
So essentially, when comparing two ordered sets they are first converted to lists, and then the Python built-in
<=
is used to compare the two lists. When you compare two lists using<=
, it's similar to a lexical string comparison meaning it compares matching indices of both lists.According to their implementation,
[433, 316, 259]
is a superset of[433, 316, 69]
(all matching indices of the first list are greater that or equal of the second list), whereas[433, 316, 259]
would not be a superset of[433, 316, 260]
(the second list has a bigger value in its last index than the first list).What they probably wanted to write was
Which would use the
<=
defined for the built-in Python sets, and properly test for subsets and supersets.You may want to report this issue here.