Collect all "minimum" solutions from a predicate

1.6k views Asked by At

Given the following facts in a database:

foo(a, 3).
foo(b, 2).
foo(c, 4).
foo(d, 3).
foo(e, 2).
foo(f, 6).
foo(g, 3).
foo(h, 2).

I want to collect all first arguments that have the smallest second argument, plus the value of the second argument. First try:

find_min_1(Min, As) :-
    setof(B-A, foo(A, B), [Min-_|_]),
    findall(A, foo(A, Min), As).

?- find_min_1(Min, As).
Min = 2,
As = [b, e, h].

Instead of setof/3, I could use aggregate/3:

find_min_2(Min, As) :-
    aggregate(min(B), A^foo(A, B), Min),
    findall(A, foo(A, Min), As).

?- find_min_2(Min, As).
Min = 2,
As = [b, e, h].

NB

This only gives the same results if I am looking for the minimum of a number. If an arithmetic expression in involved, the results might be different. If a non-number is involved, aggregate(min(...), ...) will throw an error!

Or, instead, I can use the full key-sorted list:

find_min_3(Min, As) :-
    setof(B-A, foo(A, B), [Min-First|Rest]),
    min_prefix([Min-First|Rest], Min, As).

min_prefix([Min-First|Rest], Min, [First|As]) :-
    !,
    min_prefix(Rest, Min, As).
min_prefix(_, _, []).

?- find_min_3(Min, As).
Min = 2,
As = [b, e, h].

Finally, to the question(s):

  • Can I do this directly with library(aggregate)? It feels like it should be possible....

  • Or is there a predicate like std::partition_point from the C++ standard library?

  • Or is there some easier way to do this?

EDIT:

To be more descriptive. Say there was a (library) predicate partition_point/4:

partition_point(Pred_1, List, Before, After) :-
    partition_point_1(List, Pred_1, Before, After).

partition_point_1([], _, [], []).
partition_point_1([H|T], Pred_1, Before, After) :-
    (   call(Pred_1, H)
    ->  Before = [H|B],
        partition_point_1(T, Pred_1, B, After)
    ;   Before = [],
        After = [H|T]
    ).

(I don't like the name but we can live with it for now)

Then:

find_min_4(Min, As) :-
    setof(B-A, foo(A, B), [Min-X|Rest]),
    partition_point(is_min(Min), [Min-X|Rest], Min_pairs, _),
    pairs_values(Min_pairs, As).

is_min(Min, Min-_).

?- find_min_4(Min, As).
Min = 2,
As = [b, e, h].
3

There are 3 answers

0
AudioBubble On BEST ANSWER

Using library(pairs) and [sort/4], this can be simply written as:

?- bagof(B-A, foo(A, B), Ps),
   sort(1, @=<, Ps, Ss), % or keysort(Ps, Ss)
   group_pairs_by_key(Ss, [Min-As|_]).
Min = 2,
As = [b, e, h].

This call to sort/4 can be replaced with keysort/2, but with sort/4 one can also find for example the first arguments associated with the largest second argument: just use @>= as the second argument.

This solution is probably not as time and space efficient as the other ones, but may be easier to grok.

But there is another way to do it altogether:

?- bagof(A, ( foo(A, Min), \+ ( foo(_, Y), Y @< Min ) ), As).
Min = 2,
As = [b, e, h].
4
CapelliC On

I don't think that library(aggregate) covers your use case. aggregate(min) allows for one witness:

min(Expr, Witness) A term min(Min, Witness), where Min is the minimal version of Expr over all solutions, and Witness is any other template applied to solutions that produced Min. If multiple solutions provide the same minimum, Witness corresponds to the first solution.

Some time ago, I wrote a small 'library', lag.pl, with predicates to aggregate with low overhead - hence the name (LAG = Linear AGgregate). I've added a snippet, that handles your use case:

integrate(min_list_associated, Goal, Min-Ws) :-
    State = term(_, [], _),
    forall(call(Goal, V, W),    % W stands for witness
        (    arg(1, State, C),  % C is current min
             arg(2, State, CW), % CW are current min witnesses
             (   ( var(C) ; V @< C )
             ->  U = V, Ws = [W]
             ;   U = C,
                 (   C == V
                 ->  Ws = [W|CW]
                 ;   Ws = CW
                 )
             ),
             nb_setarg(1, State, U),
             nb_setarg(2, State, Ws)
        )),
    arg(1, State, Min), arg(2, State, Ws).

It's a simple minded extension of integrate(min)... The comparison method it's surely questionable (it uses less general operator for equality), could be worth to adopt instead a conventional call like that adopted for predsort/3. Efficiency wise, still better would be to encode the comparison method as option in the 'function selector' (min_list_associated in this case)

edit thanks @false and @Boris for correcting the bug relative to the state representation. Calling nb_setarg(2, State, Ws) actually changes the term' shape, when State = (_,[],_) was used. Will update the github repo accordingly...

0
false On

What is the idiomatic approach to this class of problems?

Is there a way to simplify the problem?

Many of the following remarks could be added to many programs here on SO.

Imperative names

Every time, you write an imperative name for something that is a relation you will reduce your understanding of relations. Not much, just a little bit. Many common Prolog idioms like append/3 do not set a good example. Think of append(As,As,AsAs). The first argument of find_min(Min, As) is the minimum. So minimum_with_nodes/2 might be a better name.

findall/3

Do not use findall/3 unless the uses are rigorously checked, essentially everything must be ground. In your case it happens to work. But once you generalize foo/2 a bit, you will lose. And that is frequently a problem: You write a tiny program ; and it seems to work. Once you move to bigger ones, the same approach no longer works. findall/3 is (compared to setof/3) like a bull in a china shop smashing the fine fabric of shared variables and quantification. Another problem is that accidental failure does not lead to failure of findall/3 which often leads to bizarre, hard to imagine corner cases.

Untestable, too specific program

Another problem is somewhat related to findall/3, too. Your program is so specific, that it is quite improbable that you will ever test it. And marginal changes will invalidate your tests. So you will soon give up to perform testing. Let's see what is specific: Primarily the foo/2 relation. Yes, only an example. Think of how to set up a test configuration where foo/2 may change. After each change (writing a new file) you will have to reload the program. This is so complex, chances are you will never do it. I presume you do not have a test harness for that. Plunit for one, does not cover such testing. As a rule of thumb: If you cannot test a predicate on the top level you never will. Consider instead

minimum_with(Rel_2, Min, Els)

With such a relation, you can now have a generalized xfoo/3 with an additional parameter, say:

xfoo(o, A,B) :-
   foo(A,B).
xfoo(n, A,B) :-
   newfoo(A,B).

and you most naturally get two answers for minimum_with(xfoo(X), Min, Els). Would you have used findall/3 instead of setof/3 you already would have serious problems. Or just in general: minmum_with(\A^B^member(A-B, [x-10,y-20]), Min, Els). So you can play around on the top level and produce lots of interesting test cases.

Unchecked border cases

Your version 3 is clearly my preferred approach, however there are still some parts that can be improved. In particular, if there are answers that contain variables as a minimum. These should be checked.

And certainly, also setof/3 has its limits. And ideally you would test them. Answers should not contain constraints, in particular not in the relevant variables. This shows how setof/3 itself has certain limits. After the pioneering phase, SICStus produced many errors for constraints in such cases (mid 1990s), later changed to consequently ignoring constraints in built-ins that cannot handle them. SWI on the other hand does entirely undefined things here. Sometimes things are copied, sometimes not. As an example take: setof(A, ( A in 1..3 ; A in 3..5 ), _) and setof(t, ( A in 1..3 ; A in 3.. 5 ), _).

By wrapping the goal this can be avoided.

call_unconstrained(Goal_0) :-
   call_residue_vars(Goal_0, Vs),
   ( Vs = [] -> true ; throw(error(representation_error(constraint),_)) ).

Beware, however, that SWI has spurious constraints:

?- call_residue_vars(all_different([]), Xs).
   Xs = [_A].

Not clear if this is a feature in the meantime. It has been there since the introduction of call_residue_vars/2 about 5 years ago.