How should one proceed to prove (or find) if two regular expressions are same or equivalent?

19.4k views Asked by At

For example, in an assignment given to me, we were asked to find out if two regular expressions are equal or not.

(a+b+c)*  and ((ab)**c*)*

My question is how is one supposed to do that? If I draw the transition graphs for both and then run a few strings through it and show that both of the TGs are able to accept it, is that a sufficient proof ? If not, how do I do it? Is there a mathematical/axiomatic approach towards this?

Thanks in advance.

EDIT: There is another thing that I'd like to clear which is kind of related to this question. Are the two FAs depicted in the photo below the same?

enter image description here

i.e. Are (1) and (2) in the above picture the same?

6

There are 6 answers

1
Patrick87 On BEST ANSWER

There is an algorithm to determine whether they are equal:

  1. Construct NFA-lambdas corresponding to each RE using Kleene's theorem
  2. Construct DFAs for each using the subset/powerset construction
  3. (optional) Minimize the DFAs using a standard DFA minimization algorithm.
  4. Construct DFAs for L(M1) \ L(M2) and L(M2) \ L(M1) using the Cartesian Product Machine construction
  5. (Optional) Minimize these CPMs.
  6. Determine whether each one accepts any strings by testing all strings over alphabet E of size no greater than |Q| (works due to the pumping lemma for regular languages)

No novelty or genius is required; you could write a program to do this (although, in practice, using the powerset construction can be unwieldy, and failing to minimize at both steps can be costly).

EDIT: Yes, those DFAs are the same. The first is just a shorthand notation for the second.

0
Shiplu Mokaddim On

Assuming

  1. Spaces are inserted for illustration
  2. ( ( a b )* * c* )* is actually ((ab)*c*)**,
  3. Each pattern is wrapped by ^ and $.

Those regular expressions are NOT same.

abccabcc will not match (a+b+c)* but will match ((ab)*c*)*

How did I find this?

When I closely looked at those patterns, I found 2 things.

  1. First one accepts more than 1 of a and b {1,}. So there will be always a sequence of a and sequence of b side by side. like aaaabb, aabbbbb, etc. But in the second pattern a and be will be side by side with single instance. like ab, ababab, abababab, etc.
  2. c will appear only 1 time following sequence of a and sequence of b. But in second pattern c can appear as many times as it can.
0
Lucero On

They are different, which is easy to tell by the quantifiers. For the first expression to match anything, it must contain a c. The second can obviously do without a c. (There are many more differences, but that should get you started).

1
nachito On

Two regular expressions R and T are equivalent if the language defined by R (i.e., the set of strings generated by regular expression R) is equal to the language defined by T. To prove equivalences for regular expressions, we use containment proofs from set theory. That is, if S1 is the set of strings generated by regular expression R, and S2 is the set of strings generated by regular expression T, we must prove that S1 ⊆ S2 and S2 ⊆ S1. Both directions are necessary to prove equality of sets.

-- From the lecture notes of CSc 4340 GSU Fall 09 (Dr. Raj Sunderraman)

0
LaC On

Since this is homework, I won't give you the complete answer, but I will tell you the key fact you need to know: for a given finite state language, the DFA which recognizes it with the minimum number of states is unique.

BTW, I don't believe that your professor would assign this homework without teaching you how to do it. Get off the internet and read your lecture notes and/or textbook.

1
Vikram On

((ab)^^c^)^=( a^b^c^)^ = (a+b+c)^