Check if any element's frequency is above a limit

325 views Asked by At

I want to solve a problem that is I have a Prolog list of elements. If the any of the element frequency is greater than N then false is return. My expectation like below.

?- frequency([1,2,2,2,5],3).
true.

?- frequency([1,2,2,2,2,5],3).
false.

I have a code for get particular element frequency. Any idea for the problem.

count(_, [], 0) :-
   !.
count(X, [X|T], N) :-
   count(X, T, N2),
   N is N2 + 1.
count(X, [Y|T], N) :-
   X \= Y,
   count(X, T, N).
3

There are 3 answers

1
repeat On BEST ANSWER

Use !

:- use_module(library(clpfd)).

If we build on auxiliary predicate list_counts/2, we can define frequency/2 like this:

frequency(Es, M) :-
   list_counts(Es, Xss),
   maplist(arg(2), Xss, Zs),
   maplist(#>=(M), Zs).

Sample queries:

?- frequency([1,2,2,2,5], 3).
true.

?- frequency([1,2,2,2,2,5], 3).
false.

Thanks to we can ask quite general queries—and get logically sound answers, too!

?- frequency([A,B,C], 2).
       A=B ,           dif(B,C)
;                A=C , dif(B,C)
;            dif(A,C),     B=C
;  dif(A,B), dif(A,C), dif(B,C).
8
Giththan On

What is about this code...

frequency(L,N):-getall(L,L1), max_member(A,L1),A=<N.

getall([],[]).
getall(L,N):-append([],[X1|T],L),count(X1,L,N1),getall(T,N2),append([N1],N2,N).
8
AudioBubble On

You can first try to think about the most "cheap" (in terms of writing) way you could attack such a problem. I usually try to figure out how I would do it with standard command line tools. For example, to find the occurrences of all lines in a text file named foo, one would write (in Bash):

sort foo | uniq --count

You can read the manuals, but here is one example run:

$ cat foo
a
b
b
b
c
d
d
a
b
d
c
$ sort foo | uniq --count
      2 a
      4 b
      2 c
      3 d

Now, one way to ask "is there any line that has a count above 3?" is to use awk like this:

sort foo | uniq --count | awk '{ if ($1 > 3) exit(1) }'

(I am sure there are cleverer ways, too.)

With the above file, you get:

$ sort foo | uniq --count | awk '{ if ($1 > 3) exit(1) }'
$ echo $?
1
$ sort foo | uniq --count | awk '{ if ($1 > 4) exit(1) }'
$ echo $?
0

Alright, so how does that help you with Prolog? Well, one easy way to emulate a pipeline like this:

foo | bar | baz # etc

is to write in Prolog a conjunction like this:

foo(In, X0), bar(X0, X1), baz(X1, X2) % etc

Back to your problem: you can use msort/2 (or however a stable sort predicate is called in the implementation you are using). Then, you need to count runs of the same element. In SWI-Prolog at least you have group_pairs_by_key/2. You could use it for example as follows (together with the other predicates in the same library, you can see the code at the same link):

pairs_keys_values(Pairs, Sorted, Sorted),
group_pairs_by_key(Pairs, Grouped),
pairs_values(Grouped, Runs),
maplist(length, Runs, Counts)

At that point, you have the number of occurrences of each element in Sorted in Counts (a bit more verbose than uniq --count, admittedly), and you just need to check if any of these is above your limit. To do something very similar to the awk invocation above in Prolog:

maplist(=<(3), Counts)

Disclaimer: this is just one way to approach the problem. I decided to type it out because it shows that you rarely need to write much code yourself if you know what tools are already available to you.

Edit

It is of course unnecessary to use group_pairs_by_key/2; however, it is very useful to know about it, which is why I linked the implementation. For this problem, it would be enough to do a stable sort, followed by a predicate that simply counts the number of consecutive occurrences of the same element, and only succeed if all such runs are not longer than the limit. The basic structure of a predicate that does that is the same as group_pairs_by_key/2.