I have some Things that have 0-* names in any number of languages:
CREATE TABLE Things(
thing_id INT PRIMARY_KEY,
thing_proprty INT);
CREATE TABLE ThingNames(
thing_name_id INT PRIMARY_KEY,
thing_id INT,
language CHAR[2],
name VARCHAR[64] UNIQUE,
FOREIGN KEY (thing_id) REFERENCES Things(thing_id));
The things are relevant in a number of Fields, and in each field they have 0-1 CanonicalNames per language. The straightforward way is
CREATE TABLE Fields(
field_id INT PRIMARY_KEY,
name VARCHAR[64])
CREATE TABLE CanonicalNames(
thing_id INT
field_id INT
canonical_name_id INT
FOREIGN KEY (thing_id) REFERENCES Things(thing_id),
FOREIGN KEY (field_id) REFERENCES Fields(field_id),
FOREIGN KEY (canonical_name_id) REFERENCES ThingNames(thing_name_id));
But this misses the 0-1 constraint, which would be a uniqueness constraint on field_id together with the thing_id and language columns of ThingNames as referenced by canonical_name_id. Including all columns as foreign keys in CanonicalNames is of course redundant and error prone, so is there a way to impose uniqueness constraints across tables? Or is there a better solution here that I am not seeing?
I'm unsure about a couple of things in your design. Declaring ThingNames.name as key means that the same thing can't have the same name in two different languages, but that seems likely to happen with related languages (e.g. Norwegian and Danish), or when technical terms are left untranslated.
And the concept of relevance isn't explicitly represented in your schema. Is a Thing relevant to a Field only if it has at least one Canonical Name (for some language)?
However, making some assumptions, I'd suggest this model (Dataphor-based pseudocode, data types omitted):
CanonicalName isn't in BCNF, because of the FD
{ Canonicalname, Language } -> { ThingId }
, but the redundancy is controlled by the reference CanonicalName_ThingName. (You could call it a foreign key, but it is actually a foreign superkey.) This is not a bug, it's how you ensure that the canonical name is one of the thing names. (I'm assuming that is a rule.) In this design, having a Language column in CanonicalName isn't redundant, and it enables the 0-1 constraint you're missing.This design allows multiple Things to have the same name in different languages, but also allows different Things to have the same name in different languages. For instance, "kjole" means "dress" both in Norwegian and Danish, but "dress" in Norwegian means "suit" in English. Let me know if this should be disallowed, and I'll update the design.
The Relevance table may (or perhaps should) be omitted if there is a rule saying that a Thing is relevant to a Field if and only if it has at least one Canonical Name for that Field. Then CanonicalName would have to reference Field instead of Relevance, of course.