I have a set of elements. There are various ways I would like to partition the set, in the mathematical sense. A partition is defined as a collection of subsets such that every element is in exactly one subset.
I want to represent this situation in a relational database, and I want the relational constraints to enforce the mathematical definition of a partition.
The closest I have gotten is a schema like this:
CREATE TABLE element (
id integer PRIMARY KEY
);
CREATE TABLE partition (
id integer PRIMARY KEY
);
CREATE TABLE subset (
id integer PRIMARY KEY,
partition_id integer REFERENCES partition (id)
);
CREATE TABLE element_subset_mapping (
element_id integer REFERENCES element (id) NOT NULL,
partition_id integer REFERENCES partition (id) NOT NULL,
subset_id integer REFERENCES subset (id) NOT NULL,
CONSTRAINT uq_element_partition UNIQUE (element_id, partition_id)
);
This represents correctly that an element can only occur in one subset for a given partition. But there are at least a couple problems:
- It does not enforce that, within each partition, every element belongs to some subset. I suppose I could take the semantics that a missing (element_id, partition_id) row indicates that the element belongs to a single element subset, but I would prefer to be explicit.
- It does not enforce the requirement that a subset only belongs to one partition. That is, I could have a subset with (id, partition_id) = (22, 37), but still have a element_subset_mapping with (element_id, partition_id, subset_id) = (4005, 42, 22). This should not be allowed because subset 22 belongs to partition 37, not to partition 42.
It feels like there should be a canonical way to do this, but my searches have not yielded any answers.
The requirement that a subset belong to no more than one partition is already being enforced by the primary key constraint on
subset
; however, there is no enforcement that the combination ofsubset_id
andpartition_id
inelement_subset_mapping
reference a corresponding pair insubset
. The following DDL addresses these issues by making the combination of subset and partition unique insubset
and combining the foreign keys onsubset_id
andpartition_id
inelement_subset_mapping
into a single foreign key referencingsubset(id, partition_id)
:The requirement that each element belong to exactly one subset within each partition is a more complex constraint to enforce because it involves rows from multiple tables. Let's begin by creating a trigger function that raises an exception if any elements are missing from any partitions:
This function can then be used to define constraint triggers on each of the tables that could cause the constraint to be violated:
element
,partition
, andelement_subset_mapping
.An important aspect of these triggers is that they are
DEFERRED
, which means that they fire when the current transaction is committed instead of as each row or statement is executed. This allows a set of SQL commands to be executed within a transaction to achieve a state that is consistent with the relational requirements.A requirement that wasn't mentioned in the original post is that all subsets must be non-empty. The following SQL addresses this using the same approach that was used to ensure that all elements are included in every partition: