According to the MLton documentation:
Standard ML requires types to be defined before they are used. [link]
Not all implementations enforce this requirement (for example, SML/NJ doesn't), but the above-linked page makes a good case for why it might be needed for soundness (depending on how the implementation handles the value restriction), and it accords with some of the commentary in the Definition:
Although not assumed in our definitions, it is intended that every context C = T, U, E has the property that tynames E ⊆ T. Thus T may be thought of, loosely, as containing all type names which "have been generated". […] Of course, remarks about what "has been generated" are not precise in terms of the semantic rules. But the following precise result may easily be demonstrated:
Let S be a sentence T, U, E ⊢ phrase ⇒ A such that tynames E ⊆ T, and let S′ be a sentence T′, U′, E′ ⊢ phrase′ ⇒ A′ occurring in a proof of S; then also tynames E′ ⊆ T′.
[page 21]
But I'm doubly confused by this.
Firstly — the above theorem seems backward. If I correctly understand the phrase "occurring in a proof of S", then this seems to mean (by contrapositive) "once you have a context that violates the intention that tynames E ⊆ T, all subsequent contexts will also violate that intention". Even if that's true, it seems that it would be much more useful and meaningful to assert the converse, namely, "if all contexts so far conform to the intention that tynames E ⊆ T, then any subsequently inferable context will also conform to that intention". No?
Secondly — neither MLton's statement nor the Definition's statement actually seems to be supported by the inference rules (or the "Further Restrictions" that follow them). A few inference rules have "tynames τ ⊆ T of C" or "tynames VE ⊆ T of C" as a side-condition, but none of those rules is needed for this program (given in the above-linked documentation):
```sml
val r = ref NONE
datatype t = A | B
val () = r := SOME A
```
(Specifically: rule (4) has to do with `let`, rule (14) with `=>`, and rule (26) with `rec`. None of those is used in this program.)
And coming at it from the other direction, rule (17), which covers `datatype` declarations, requires only that the generated type names not be in T of C; so it doesn't prevent the generation of a type name used in the existing value environment (except insofar as it's already true that tynames VE ⊆ T of C).
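(For what it's worth, I don't think the value restriction itself is my confusion. As I understand it, `ref NONE` is expansive, so `r`'s type cannot be generalized; reading the rules nondeterministically, `r` simply gets some one monotype τ option ref, which the rest of the program then has to be consistent with. A throwaway sketch of just that part:

```sml
val r = ref NONE        (* r : τ option ref for a single chosen monotype τ *)
val () = r := SOME 1    (* elaborates, provided τ was chosen to be int     *)
val () = r := SOME "x"  (* impossible: τ cannot also be string             *)
```

So my question is really only about whether that τ may mention a type name that hasn't been generated yet at the point where `r` is declared.)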
I feel like I'm probably missing something pretty basic here, but I have no idea what it could be!
Regarding your first question, I'm not sure why you suggest that reading. The result basically says that if you have a derivation S (think of it as a tree) whose context satisfies the condition, then all its subderivations (think subtrees) will have contexts that also satisfy the condition. In other words, all rules maintain the condition. Think of the condition as the well-formedness requirement for contexts C.
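To make the direction explicit, here is my paraphrase of the page-21 statement as a single implication, using its own metavariables:

for every sentence S′ = (T′, U′, E′ ⊢ phrase′ ⇒ A′) occurring in a proof of S = (T, U, E ⊢ phrase ⇒ A): tynames E ⊆ T implies tynames E′ ⊆ T′.

The quantifier ranges over the subderivations of one fixed derivation, not over later phrases of the program, so its contrapositive says "a violation anywhere inside the proof forces a violation at the root", not "a violation now forces violations later".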
Regarding your second question, note the use of ⊕ in the sequencing rule (24), which extends T of C as needed. More concretely, if `r` was assigned type `t option ref`, then the first declaration would produce an environment E1 with the corresponding t ∈ tynames E1. Then, according to the sequencing rule (24), the second declaration would have to be elaborated under the context C' = C ⊕ E1, which is defined as C + (tynames E1, E1) in Section 4.3. Hence, t ∈ T of C', as required for well-formedness, and consequently, rule (17) would not be able to pick the same t as the denotation of `t`.
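To trace that through the example program: the assignment of `t option ref` to `r` is hypothetical, purely to show where the derivation breaks down, and the comments track how T of C grows:

```sml
val r = ref NONE        (* suppose this yields E1 with r ↦ t option ref,
                           so t ∈ tynames E1                              *)
datatype t = A | B      (* by rule (24) this is elaborated under
                           C' = C ⊕ E1 = C + (tynames E1, E1),
                           hence t ∈ T of C'; rule (17) must generate a
                           type name not in T of C', so it cannot pick t  *)
val () = r := SOME A    (* thus the datatype's type name can never equal
                           the one in r's type, so this assignment, and
                           with it the assumed typing of r, cannot
                           elaborate                                      *)
```

So the rejection comes from rule (17)'s freshness side-condition together with the ⊕ in rule (24), not from the side-conditions on rules (4), (14), or (26).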