How does the pumping length of a regular language relate to the pumping length of a related language. For example, if A :< B :< C are all regular languages and k is the pumping length of B, do we know anything about the pumping lengths of A and C?
One might be inclined naively to think that a sublanguage has a smaller (<=) pumping length when we look at finite languages. {a,ab,abc} :< {a,ab,abc,abcd} have respective pumping lengths 4 <= 5. Taking an element out of a set can't make its longest word even longer.
On the other hand if you look at the state machine formed by the synchronized product of two languages, the intersection language and the union language have the same state machine structure, but differ in that the set of final states of the intersection is a subset of the set of final states of the union. Having more final states, could make it more probable to find a shorter path through the state machine. But on the contrary having fewer final states makes it more likely that the state machine has non-co-accessible states, and is thus reducible.
Note first that all languages over an alphabet are related to all other languages over that alphabet by some relation, however contrived. We really do need to limit discussion, as you have suggested, to something like
subset
in order to meaningfully scope the question.Now, you've already correctly noted that in the general case, the
subset
relation doesn't have a clear bearing on the pumping lengths of relative languages. You can easily takea*
and compare to{a^n}
and show that a minimal DFA fora*
is almost always simpler than one for{a^n}
.Let us further restrict ourselves to languages that differ by a finite number of entries; that is,
L \ R
is finite andR \ L
is finite. This is an indicator of similarity different fromsubset
; but if we require that, w.l.o.g., thatR \ L
be empty, then we recover a restricted version ofsubset
. We might now ask the same question given this modified version: for languages that differ in a finite number of entries, does thesubset
relation tell us anything?The answer is still no. Consider
L = a*
andR = a* \ A
, whereA
is a finite non-empty subset ofa*
. Even still,L
takes one state andR
takes potentially many more.Restricting ourselves to finite sets only, as you suggest, does let us deduce what you propose: that a minimal automaton for
R
will have no more states than the one forL
. Why is this? We must haven+1
states to accept any string of lengthn
, and we must have a dead state to accept strings not in the language (of which there will be infinitely many). A minimal DFA will have no loops at all (except centering on the dead state) if the language is finite, since otherwise you'd be able to get infinitely many strings.Your observation about taking the Cartesian product is correct, in that the result of applying the construction gives a structurally identical DFA for any set operation (union, difference, intersection, etc.); however, these DFAs are not guaranteed to be minimal and the point for finite languages still holds, namely, the DFA for intersection will have no more states than the one for union, before and after DFA minimization of both machines.
You might be able to take the Myhill-Nerode theorem and define a notion of "close-relatedness" using that which does allow you to compare languages to determine which will have the larger minimal DFA. I am not sure about that, however, since an easy way of doing that would allow you to compare to parameterized reference languages to, for instance, easily prove any non-regular language is non-regular, which might be a big deal in mathematics in its own right (like, either it's impossible in general or it would prove P=NP, etc. to do so).