Clarify search algorithms in different minikanren implementation

336 views Asked by At

I am currently learning miniKanren by The Reasoned Schemer and Racket.

I have three versions of minikanren implementation:

  1. The Reasoned Schemer, First Edition (MIT Press, 2005). I called it TRS1

    https://github.com/miniKanren/TheReasonedSchemer

    PS. It says that condi has been replaced by an improved version of conde which performs interleaving.

  2. The Reasoned Schemer, Second Edition (MIT Press, 2018). I called it TRS2

    https://github.com/TheReasonedSchemer2ndEd/CodeFromTheReasonedSchemer2ndEd

  3. The Reasoned Schemer, First Edition (MIT Press, 2005). I called it TRS1*

    https://docs.racket-lang.org/minikanren/

I have did some experiments about the three implementations above:

1st experiment:

TRS1

(run* (r)
      (fresh (x y)
             (conde
              ((== 'a x) (conde
                          ((== 'c y) )
                          ((== 'd y))))
              ((== 'b x) (conde
                          ((== 'e y) )
                          ((== 'f y)))))
             (== `(,x ,y) r)))

;; => '((a c) (a d) (b e) (b f))

TRS2

(run* (x y)
      (conde
       ((== 'a x) (conde
                   ((== 'c y) )
                   ((== 'd y))))
       ((== 'b x) (conde
                   ((== 'e y) )
                   ((== 'f y))))))
;; => '((a c) (a d) (b e) (b f))  

TRS1*

(run* (r)
      (fresh (x y)
             (conde
              ((== 'a x) (conde
                          ((== 'c y) )
                          ((== 'd y))))
              ((== 'b x) (conde
                          ((== 'e y) )
                          ((== 'f y)))))
             (== `(,x ,y) r)))
;; => '((a c) (b e) (a d) (b f))

Notice that, in the 1st experiment, TRS1 and TRS2 produced the same result, but TRS1* produced a different result.

It seems that the conde in TRS1 and TRS2 use the same search algorithm, but TRS1* use a different algorithm.

2nd experiment:

TRS1

(define listo
  (lambda (l)
    (conde
     ((nullo l) succeed)
     ((pairo l)
      (fresh (d)
             (cdro l d)
             (listo d)))
     (else fail))))

(define lolo
  (lambda (l)
    (conde
     ((nullo l) succeed)
     ((fresh (a) 
             (caro l a)
             (listo a))
      (fresh (d)
             (cdro l d)
             (lolo d)))
     (else fail))))
     
(run 5 (x)
     (lolo x))
;; => '(() (()) (() ()) (() () ()) (() () () ()))

TRS2

(defrel (listo l)
  (conde
   ((nullo l))
   ((fresh (d)
           (cdro l d)
           (listo d)))))

(defrel (lolo l)
  (conde
   ((nullo l))
   ((fresh (a)
           (caro l a)
           (listo a))
    (fresh (d)
           (cdro l d)
           (lolo d)))))

(run 5 x
     (lolo x))
;; => '(() (()) ((_0)) (() ()) ((_0 _1)))

TRS1*

(define listo
  (lambda (l)
    (conde
      ((nullo l) succeed)
      ((pairo l)
       (fresh (d)
         (cdro l d)
         (listo d)))
      (else fail))))

(define lolo
  (lambda (l)
    (conde
      ((nullo l) succeed)
      ((fresh (a) 
         (caro l a)
         (listo a))
       (fresh (d)
         (cdro l d)
         (lolo d)))
      (else fail))))

(run 5 (x)
     (lolo x))
;; => '(() (()) ((_.0)) (() ()) ((_.0 _.1)))

Notice that, in the 2nd experiment, TRS2 and TRS1* produced the same result, but TRS1 produced a different result.

It seems that the conde in TRS2 and TRS1* use the same search algorithm, but TRS1 use a different algorithm.

These makes me very confusion.

Could someone help me to clarify these different search algorithms in each minikanren implementation above?

Very thanks.

---- ADD A NEW EXPERIMENT ----

3nd experiment:

TRS1

(define (tmp-rel y)
  (conde
   ((== 'c y) )
   ((tmp-rel-2 y))))

(define (tmp-rel-2 y)
  (== 'd y)
  (tmp-rel-2 y))

(run 1 (r)
      (fresh (x y)
             (conde
              ((== 'a x) (tmp-rel y)) 
              ((== 'b x) (conde
                          ((== 'e y) )
                          ((== 'f y)))))
             (== `(,x ,y) r)))

;; => '((a c))

However, run 2 or run 3 loops.

If I use condi instead of conde, then run 2 works but run 3 still loop.

TRS2

(defrel (tmp-rel y)
  (conde
   ((== 'c y) )
   ((tmp-rel-2 y))))

(defrel (tmp-rel-2 y)
  (== 'd y)
  (tmp-rel-2 y))

(run 3 r
      (fresh (x y)
             (conde
              ((== 'a x) (tmp-rel y))
              ((== 'b x) (conde
                          ((== 'e y) )
                          ((== 'f y)))))
             (== `(,x ,y) r)))
             
;; => '((b e) (b f) (a c)) 

This is OK, except that the order is not as expected.

Notice that (a c) is at the last now.

TR1*

(define (tmp-rel y)
  (conde
   ((== 'c y) )
   ((tmp-rel-2 y))))

;;
(define (tmp-rel-2 y)
  (== 'd y)
  (tmp-rel-2 y))

(run 2 (r)
      (fresh (x y)
             (conde
              ((== 'a x) (tmp-rel y))
              ((== 'b x) (conde
                          ((== 'e y) )
                          ((== 'f y)))))
             (== `(,x ,y) r)))

;; => '((a c) (b e))

However, run 3 loops.

2

There are 2 answers

0
chansey On BEST ANSWER

After several days of research, I think I have been able to answer this question.

1. Concept clarification

First of all, I'd like to clarify some concepts:

There are two well-known models of non-deterministic computation: the stream model and the two-continuations model. Most of miniKanren implementations use the stream model.

PS. The term "backtracking" generally means depth-first search (DFS), which can be modeled by either the stream model or the two-continuations model. (So when I say "xxx get tried", it doesn't mean that the underlying implementation have to use two-continuations model. It can be implemented by stream model, e.g. minikanren.)

2. Explain the different versions of the conde or condi

2.1 conde and condi in TRS1

TRS1 provides two goal constructors for non-deterministic choice, conde and condi.

conde uses DFS, which be implemented by MonadPlus of stream.

The disadvantage of MonadPlus is that it is not fair. When the first alternative offers an infinite number of results, the second alternative is never tried. It making the search incomplete.

To solve this incomplete problem, TRS1 introduced condi which can interleave the two results.

The problem of the condi is that it can’t work well with divergence (I mean dead loop with no value). For example, if the first alternative diverged, the second alternative still cannot be tried.

This phenomenon is described in the Frame 6:30 and 6:31 of the book. In some cases you may use alli to rescue, see Frame 6:32, but in general it still can not cover all the diverged cases, see Frame 6:39 or the following case: (PS. All these problems do not exist in TRS2.)

(define (nevero)
  (all (nevero)))

(run 2 (q)
     (condi
      ((nevero))
      ((== #t q))
      ((== #f q))))
;; => divergence

Implementation details:

In TRS1, a stream is a standard stream, i.e. lazy-list.

The conde is implemented by mplus:

(define mplus
  (lambda (a-inf f)
    (case-inf a-inf
              (f)
              ((a) (choice a f))
              ((a f0) (choice a (lambdaf@ () (mplus (f0) f)))))))

The condi is implemented by mplusi

:(define mplusi
  (lambda (a-inf f)
    (case-inf a-inf
              (f) 
              ((a) (choice a f))
              ((a f0) (choice a (lambdaf@ () (mplusi (f) f0)))))) ; interleaving 

2.2 conde in TRS2

TRS2 removed the above two goal constructors and provided a new conde .

The conde like the condi, but only interleaving when the first alternative is a return value of a relation which be defined by defref. So it is actually more like the old conde if you won't use defref.

The conde also fixed the above problem of condi.

Implementation details:

In TRS2, a stream is not a standard stream.

As the book says that

A stream is either the empty list, a pair whose cdr is a stream, or a suspension.

A suspension is a function formed from (lambda () body) where (( lambda () body)) is a stream.

So in TRS2, streams are not lazy in every element, but just lazy at suspension points.

There is only one place to initially create a suspension, i.e. defref:

(define-syntax defrel
  (syntax-rules ()
    ((defrel (name x ...) g ...)
     (define (name x ...)
       (lambda (s)
         (lambda ()
           ((conj g ...) s)))))))

This is reasonable because the "only" way to produce infinite results or diverge is recursive relation. It also means that if you use define instead of defrel to define a relation, you will encounter the same problem of conde in TRS1 (It is OK for finite depth-first search).

Note that I had to put quotation marks on the "only" because most of the time we will use recursive relations, however you still can produce infinite results or diverge by mixing Scheme's named let, for example:

(run 10 q
     (let loop ()
       (conde
        ((== #f q))
        ((== #t q))
        ((loop)))))
;; => divergence

This diverged because there is no suspension now.

We can work around it by wrapping a suspension manually:

(define-syntax Zzz
    (syntax-rules ()
        [(_ g) (λ (s) (λ () (g s)))]))

(run 10 q
     (let loop ()
       (Zzz (conde
             ((== #f q))
             ((== #t q))
             ((loop)))) ))
;; => '(#f #t #f #t #f #t #f #t #f #t)   

The conde is implemented by append-inf:

(define (append-inf s-inf t-inf)
  (cond
    ((null? s-inf) t-inf)
    ((pair? s-inf) 
     (cons (car s-inf)
       (append-inf (cdr s-inf) t-inf)))
    (else (lambda () ; interleaving when s-inf is a suspension 
            (append-inf t-inf (s-inf))))))

2.3 conde in TRS1*

TRS1* originates from the early paper "From Variadic Functions to Variadic Relations A miniKanren Perspective". As TRS2, TRS1* also removed the two old goal constructors and provided a new conde.

The conde like the conde in TRS2, but only interleaving when the first alternative itself is a conde.

The conde also fixed the above problem of condi.

Note that there is no defref in TRS1*. Therefore if the recursive relations are not starting from conde, you will encounter the same problem of condi in TRS1. For example,

(define (nevero)
  (fresh (x)
         (nevero)))
          
(run 2 (q)
     (conde
      ((nevero))
      ((== #t q))
      ((== #f q))))
;; => divergence

We can work around this problem by wrapping a conde manually:

(define (nevero)
  (conde
   ((fresh (x)
          (nevero)))))

(run 2 (q)
     (conde
      ((nevero))
      ((== #t q))
      ((== #f q))
      ))
;; => '(#t #f)

Implementation details:

In TRS1*, the stream is the standard stream + suspension.

(define-syntax conde
  (syntax-rules ()
    ((_ (g0 g ...) (g1 g^ ...) ...)
     (lambdag@ (s)
               (inc ; suspension which represents a incomplete stream
                (mplus* 
                 (bind* (g0 s) g ...)
                 (bind* (g1 s) g^ ...) ...))))))

(define-syntax mplus*
  (syntax-rules ()
    ((_ e) e)
    ((_ e0 e ...) (mplus e0 (lambdaf@ () (mplus* e ...)))))) ; the 2nd arg of the mplus application must wrap a suspension, because multiple clauses of a conde are just syntactic sugar of nested conde with 2 goals.

It also means that the named let loop problem above does not exist in TRS1*.

The conde is implemented by the interleaving mplus:

(define mplus
   (lambda (a-inf f)
     (case-inf a-inf
        (f)
        ((a) (choice a f))
        ((a f^) (choice a (lambdaf@ () (mplus (f) f^)))) 
        ((f^) (inc (mplus (f) f^)))))) ; interleaving when a-inf is a suspension
                                       ; assuming f must be a suspension

Note that although the function is named mplus, it is not a legal MonadPlus because it does not obey MonadPlus law.

3. Explain these experiments in the question.

Now I can explain these experiments in the question.

1st experiment

TRS1 => '((a c) (a d) (b e) (b f)) , because conde in TRS1 is DFS.

TRS2 => '((a c) (a d) (b e) (b f)) , because conde in TRS2 is DFS if no defref involved.

TRS1* => '((a c) (b e) (a d) (b f)), because conde in TRS1* is interleaving (the outmost conde make the two innermost condes interleaving).

Note that if we replace conde with condi in TRS1, the result will be the same as TRS1*.

2nd experiment

TRS1 => '(() (()) (() ()) (() () ()) (() () () ())) , because conde in TRS1 is DFS. The second clause of conde in listo is never tried, since when (fresh (d) (cdro l d) (lolo d) is binded to the first clause of conde in listo it offers an infinite number of results.

TRS2 => '(() (()) ((_0)) (() ()) ((_0 _1))) , because now the second clause of conde in listo can get tried. listo and lolo being defined by defrel means that they will potentially create suspensions. When append-inf these two suspensions, each takes a step and then yield control to the other.

TRS1* => '(() (()) ((_.0)) (() ()) ((_.0 _.1)), is the same as TRS2, except that suspensions are created by conde.

Note that replacing conde with condi in TRS1 will not change the result. If you want to get the same result as TRS2 or TRS1*, wrap alli at the second clause of conde.

3rd experiment

Note that as @WillNess said in his comment of the question:

BTW I didn't know you could write (define (tmp-rel-2 y) (== 'd y) (tmp-rel-2 y)) like that, without any special minikanren form enclosing the two goals...

Yes, the 3rd experiment about TRS1 and TRS1* has a mistake:

(define (tmp-rel-2 y) ; <--- wrong relation definition!
  (== 'd y)
  (tmp-rel-2 y))

Unlike TRS2, TRS1 and TRS1* have no build-in defrel, so the define form is from Scheme, not minikaren.

We should use a special minikanren form enclosing the two goals.

Therefore,

For TRS1, we should change the definition to

(define (tmp-rel-2 y)
  (all (== 'd y)
       (tmp-rel-2 y)))

For TRS1*, there is no all constructor, but we can use (fresh (x) ...) to work around it

(define (tmp-rel-2 y)
  (fresh (x)
         (== 'd y)
         (tmp-rel-2 y)))

I made this mistake because I was not familiar with minikanren before.

However, this mistake won't affect the final result, and the explanation below for TRS1 and TRS1* are suitable for both the wrong definition and the correct definition.

TRS1 => '((a c)), because conde in TRS1 is DFS. The tmp-rel diverges at tmp-rel-2.

Note that replacing conde with condi and (run 2 ...), we will get '((a c) (b e)). This because condi can interleave. However, it still cannot print the third solution (b f) because condi can’t work well with divergence.

TRS2 => '((b e) (b f) (a c)) , because TRS2 can archive complete search if we use defrel to define relation.

Note that the final result is '((b e) (b f) (a c)) instead of '((a c) (b e) (b f)) because in TRS2, a suspension only initially be created by defrel. If we expect '((a c) (b e) (b f)), we can wrap the suspension manually:

(define-syntax Zzz
    (syntax-rules ()
        [(_ g) (λ (s) (λ () (g s)))]))
        
(run 3 r
     (fresh (x y)
            (conde
             ((== 'a x) (tmp-rel y))
             ((== 'b x) (Zzz (conde ; wrap a suspension by Zzz
                              ((== 'e y) )
                              ((== 'f y))))))
            (== `(,x ,y) r)))

;; => '((a c) (b e) (b f)) 

TRS1* => '((a c) (b e)), because in TRS1*, suspensions be wrapped at condes .

Note that it still cannot print the third solution (b f) because tmp-rel-2 does not be wrapped in conde, so no suspension is created here. If we expect '((a c) (b e) (b f)), we can wrap the suspension manually:

(define (tmp-rel-2 y)
  (conde ((== 'd y) (tmp-rel-2 y)))) ; wrap a suspension by conde

4. Conclusion

All in all, minikanren is not one language but families of languages. Each minikanren implementation may have its own hack. There may be some corner cases which have slightly different behaviors in different implementations. Fortunately, minikanren is easy to understand. When encountering these corner cases, we can solve them by reading the source code.

5. References

  1. The Reasoned Schemer, First Edition (MIT Press, 2005)

  2. From Variadic Functions to Variadic Relations - A miniKanren Perspective

  3. The Reasoned Schemer, Second Edition (MIT Press, 2018)

  4. µKanren: A Minimal Functional Core for Relational Programming

  5. Backtracking, Interleaving, and Terminating Monad Transformers

0
Will Ness On

Your first experiment in TRS1 implementation, in Prolog ("and" is ,, "or" is ;) and in an equivalent symbolic Logic notation ("and" is *, "or" is +), proceeds as if

ex1_TRS1( R )
  := (   X=a  , ( Y=c   ;    Y=d ) ;   X=b  , ( Y=e   ;    Y=f ) ) ,  R=[X,Y]  ;; Prolog 

  == (  {X=a} * ({Y=c}  +   {Y=d}) +  {X=b} * ({Y=e}  +   {Y=f}) ) * {R=[X,Y]}  ;; Logic

  == ( ({X=a}*{Y=c} + {X=a}*{Y=d}) + ({X=b}*{Y=e} + {X=b}*{Y=f}) ) * {R=[X,Y]}  ;; 1

 ----( (    <A>     +     <B>    ) + (    <C>     +     <D>    ) )------------
 ----(      <A>     +     <B>      +      <C>     +     <D>      )------------

  == (  {X=a}*{Y=c} + {X=a}*{Y=d}  +  {X=b}*{Y=e} + {X=b}*{Y=f}  ) * {R=[X,Y]}  ;; 2

  ==    {X=a}*{Y=c}*{R=[X,Y]}                                       ;; Distribution
                    + {X=a}*{Y=d}*{R=[X,Y]} 
                                   +  {X=b}*{Y=e}*{R=[X,Y]} 
                                                  + {X=b}*{Y=f}*{R=[X,Y]}
  ==    {X=a}*{Y=c}*{R=[a,c]} 
                    + {X=a}*{Y=d}*{R=[a,d]}                           ;; Reconciling
                                   +  {X=b}*{Y=e}*{R=[b,e]} 
                                                  + {X=b}*{Y=f}*{R=[b,f]}
                                                                        ;; Reporting
   ==               {R=[a,c]} +   {R=[a,d]} +     {R=[b,e]} +   {R=[b,f]}

;; => ((a c) (a d) (b e) (b f))  

The * operation must perform some validations, so that {P=1}*{P=2} ==> {}, i.e. nothing at all, since those two assignments are inconsistent with one another. It can also perform simplifications by substitution, going from {X=a}*{Y=c}*{R=[X,Y]} to {X=a}*{Y=c}*{R=[a,c]}.

Evidently, in this implementation, ((<A> + <B>) + (<C> + <D>)) == (<A> + <B> + <C> + <D>) (as seen in the ;; 1 --> ;; 2 step). Apparently it is the same in TRS2:

ex1_TRS2( [X,Y] )  :=  ( X=a, (Y=c ; Y=d)  ;  X=b, (Y=e ; Y=f) ).
;; => ((a c) (a d) (b e) (b f))  

But in TRS1* the results' ordering is different,

ex1_TRS1_star( R ) :=  ( X=a, (Y=c ; Y=d)  ;  X=b, (Y=e ; Y=f) ), R=[X,Y].
;; => ((a c) (b e) (a d) (b f))

so there it must have been ((<A> + <B>) + (<C> + <D>)) == (<A> + <C> + <B> + <D>).

Up to the ordering, the results are the same.

There's no search algorithm in the book, just the solutions streams' mixing algorithm. But since the streams are lazy it achieves the same thing.

You can go through the rest in the same manner and discover more properties of + in each particular implementation.