I have to store DOIs in a MySQL database. The handbook says:
There is no limitation on the length of a DOI name.
So far, the maximum length of a DOI in my current data is 78 chars. Which field length would you recommend in order to not waste storage space and to be on the safe side? In general:
How do you handle the problem of not knowing the maximum length of input data that has to be stored in a database, considering space and the efficiency of transactions?
EDIT
There are these two (simplified) tables document
and topic
with a one-to-many relationship:
CREATE TABLE document
(
ID int(11) NOT NULL,
DOI ??? NOT NULL,
PRIMARY KEY (ID)
);
CREATE TABLE topic
(
ID int(11) NOT NULL,
DocID int(11) NOT NULL,
Name varchar(255) NOT NULL,
PRIMARY KEY (ID),
FOREIGN KEY (DocID) REFERENCES Document(ID), UNIQUE(DocID)
);
I have to run the following (simplified) query for statistics, returning the total value of referenced topic-categories per document (if there are any references):
SELECT COUNT(topic.Name) AS number, document.DOI
FROM document LEFT OUTER JOIN topic
ON document.ID = topic.DocID
GROUP BY document.DOI;
The character set used is utf_8_general_ci
.
TEXT
andVARCHAR
can store 64KB. If you're being extra paranoid, useLONGTEXT
which allows 4GB, though if the names are actually longer than 64KB then that is a really abusive standard.VARCHAR(65535)
is probably a reasonable accommodation.Since
VARCHAR
is variable length then you really only pay for the extra storage if and when it's used. The limit is just there to cap how much data can, theoretically, be put in the field.