Why is SWI-Prolog only giving one solution?

2.2k views Asked by At

I'll be honest, I'm a Prolog newbie, so please excuse my ignorance.

I have a simple predicate to count the occurences of an atom in a list, as follows:

count(L, B, C) :-
    L = [], C = 0, !;
    L = [H|T], H \= B, count(T, B, C), !;
    L = [H|T], H = B, count(T, B, C1), C is C1 + 1.

The following queries return the correct results:

?- count([a, c, g, t], a, C).
C = 1.

?- count([a, c, g, t], c, C).
C = 1.

?- count([a, c, g, t], g, C).
C = 1.

?- count([a, c, g, t], t, C).
C = 1.

However, if I try to find all possible solutions, it only gives one.

?- count([a, c, g, t], X, C).
X = a,
C = 1.

How do I get it to give all the solutions? I though it could have something to do with the cut operator, but removing it doesn't seem to work either.

4

There are 4 answers

0
mat On BEST ANSWER

The cuts are indeed one problem: Try for example the most general query

?- count(Ls, L, C).

and see that it yields only a single solution, although there clearly should be infinitely many because the first argument can be a list of arbitrary length. So first remove all cuts. The other problem is (\=)/2, which is not a true relation: It is only sound if its arguments are ground. Instead of (\=)/2, use the more general predicate dif/2, which is available in SWI-Prolog as well as other systems and constrains its arguments to be different terms. Your predicate will then work in all directions.

EDIT: I expand on the "usable in all directions" point. Consider the following version of list_term_count/3, which relates a list to the number of occurrences of a term in that list, using constraints in addition to dif/2:

list_term_count([], _, 0).
list_term_count([L|Ls], L, N) :-
        list_term_count(Ls, L, N0),
        N #= N0 + 1.
list_term_count([L|Ls], E, N) :-
        dif(L, E),
        list_term_count(Ls, E, N).

We can use it in the most general way imaginable, which is leaving all arguments unspecified, and obtain correct answers:

?- list_term_count(Ls, E, N).
Ls = [],
N = 0 ;
Ls = [E],
N = 1 .

To fairly enumerate all solutions, we can use length/2:

?- length(Ls, _), list_term_count(Ls, E, N).
Ls = [],
N = 0 ;
Ls = [E],
N = 1 ;
Ls = [_G167],
N = 0,
dif(_G167, E) .

Notice the dif/2 constraint which occurs as a residual goal, and which constrains the list's element to be distinct from E when N is 0. This is how we can express an infinite set of terms that is not restricted in any other way except for being different from E.

Any other instantiation pattern is also admissible. For example:

?- list_term_count([a], E, N).
E = a,
N = 1 ;
N = 0,
dif(E, a).

Or for example:

?- list_term_count([X], a, N).
X = a,
N = 1 ;
N = 0,
dif(X, a).

This generality is one of the benefits of using pure monotonic predicates in your programs. Using pure goals also allows us to quite freely reorder them.

1
luksan On

Well this seemed to work.

base(a).
base(c).
base(g).
base(t).

count(L, B, C) :-
    base(B),
    (
        L = [], C = 0;
        L = [H|T], H = B, count(T, B, C1), C is C1 + 1;
        L = [H|T], H \= B, count(T, B, C)
    ).

That makes sense, since Prolog doesn't have any other way to know what the domain of B is I suppose. I'm just surprised it didn't give me that 'argument not sufficient instantiated' error the original way. Also, if I include the cut operators it doesn't work, which I'm also unable to account for as of yet.

4
CapelliC On

I think you're exploring the «dark side» of Prolog: aggregation.

Indeed, your count/3 predicate could be - courtesy of library(aggregate) -

count(L, B, C) :-
    aggregate(count, member(B, L), C).

Being a language based on a relational data model, we are akin to expect the usual goodies available from SQL, starting from the simpler ones:

select count( distinct B ) from L

If you inspect the library implementation, (for instance by means of ?- edit(aggregate/3).), you will appreciate the complexity required to generalize the 'one tuple at time' oriented Prolog execution model to a 'set oriented' one.

4
Cephalopod On

The unexpected behavior comes from Prolog's Unification, which always tries to succeed.

Read H \= B as not(H = B). When B is unbound, H = B will always succeed, because unification is possible, thus not(...) will always fail. As a result, recursion will only occur in the third branch, after B was bound to H.

Let's imagine H \= B would succeed for an unbound B, and the recursion happens. Now, if the same element H occurs again, it may be bound to B this time, which will give you multiple results for each value. E.g., count([a,a],X,C) would return X = a, C = 1. X = a, C = 2.. Certainly not what you wanted.

However, if I try to find all possible solutions, it only gives one.

?- count([a, c, g, t], X, C). X = a, C = 1.

What would you say are "all possible solutions"? The are infinite values for X for which C is zero (Protip: try ?- X.). Or do you mean all values in the list?

There is a very simple solution to your problem: just make sure B is bound when \= is reached. The easiest way to achieve this is simply changing the order of predicates, i.e., move the inequality to the end.

% count(?List, ?Element, ?Count)
count([], _, 0).
count([E|R], E, C) :-
     count(R, E, C0),
     C is C0+1.
count([F|R], E, C) :-
     count(R, E, C),
     F \= E.

Optimizations are left as an exercise to the reader.

A final note: don't use cuts (!) until you actually understand them, most of the times you won't need them anyway.