Relation of pumping lengths between related regular languages

129 views Asked by At

How does the pumping length of a regular language relate to the pumping length of a related language. For example, if A :< B :< C are all regular languages and k is the pumping length of B, do we know anything about the pumping lengths of A and C?

One might be inclined naively to think that a sublanguage has a smaller (<=) pumping length when we look at finite languages. {a,ab,abc} :< {a,ab,abc,abcd} have respective pumping lengths 4 <= 5. Taking an element out of a set can't make its longest word even longer.

On the other hand if you look at the state machine formed by the synchronized product of two languages, the intersection language and the union language have the same state machine structure, but differ in that the set of final states of the intersection is a subset of the set of final states of the union. Having more final states, could make it more probable to find a shorter path through the state machine. But on the contrary having fewer final states makes it more likely that the state machine has non-co-accessible states, and is thus reducible.

1

There are 1 answers

0
Patrick87 On

Note first that all languages over an alphabet are related to all other languages over that alphabet by some relation, however contrived. We really do need to limit discussion, as you have suggested, to something like subset in order to meaningfully scope the question.

Now, you've already correctly noted that in the general case, the subset relation doesn't have a clear bearing on the pumping lengths of relative languages. You can easily take a* and compare to {a^n} and show that a minimal DFA for a* is almost always simpler than one for {a^n}.

Let us further restrict ourselves to languages that differ by a finite number of entries; that is, L \ R is finite and R \ L is finite. This is an indicator of similarity different from subset; but if we require that, w.l.o.g., that R \ L be empty, then we recover a restricted version of subset. We might now ask the same question given this modified version: for languages that differ in a finite number of entries, does the subset relation tell us anything?

The answer is still no. Consider L = a* and R = a* \ A, where A is a finite non-empty subset of a*. Even still, L takes one state and R takes potentially many more.

Restricting ourselves to finite sets only, as you suggest, does let us deduce what you propose: that a minimal automaton for R will have no more states than the one for L. Why is this? We must have n+1 states to accept any string of length n, and we must have a dead state to accept strings not in the language (of which there will be infinitely many). A minimal DFA will have no loops at all (except centering on the dead state) if the language is finite, since otherwise you'd be able to get infinitely many strings.

Your observation about taking the Cartesian product is correct, in that the result of applying the construction gives a structurally identical DFA for any set operation (union, difference, intersection, etc.); however, these DFAs are not guaranteed to be minimal and the point for finite languages still holds, namely, the DFA for intersection will have no more states than the one for union, before and after DFA minimization of both machines.

You might be able to take the Myhill-Nerode theorem and define a notion of "close-relatedness" using that which does allow you to compare languages to determine which will have the larger minimal DFA. I am not sure about that, however, since an easy way of doing that would allow you to compare to parameterized reference languages to, for instance, easily prove any non-regular language is non-regular, which might be a big deal in mathematics in its own right (like, either it's impossible in general or it would prove P=NP, etc. to do so).