# Re: Complex Subset Computing in Prolog

On Dec 29, 5:25 am, bart demoen <b...@xxxxxxxxxxxxxx> wrote:
On Mon, 29 Dec 2008 00:46:23 -0800, bird5467 wrote:
On 28 Dez., 17:47, prolognerd <hm.ja...@xxxxxxxxx> wrote:
Hi,
your subset rule does not deliver the correct subsets. Cu
Josef- Hide quoted text -

- Show quoted text -

Josef,

I think they do, specially for ground terms. For example for terms such
as subset([1,4],[4,2,3,1,6]), it works just fine. I needed this rule
because I did not want it to be order dependent. But the problem is
with goals like subset([X,Y],[4,2,3,1,6]), it does not work properly.
It may so happen that unification will make X=4 and Y=4 and succeed,
because they are tried as member(X,[4,2,3,1,6]), fof example and then
again member(Y,[4,2,3,1,6]). In both case it succeeds with X=4 and Y=4.

But if you try to unify p([X,Y]) with p([4,1]), it will prefectly
produce X=4, and Y=1, and fail for p(4,1,3]).

Yes. But in your example you complain that you get two 'a'. And thats
exactly because the arguments to subset are not ground and the subset
contains multiple elements.

I see the problem, that the sets/lists are not ordered. Solutions could
be:
a) Sort the input sets/lists and work with the sorted lists b) Use a
real subset and permutate them to handle all orders c) Claim explicit W
=/= X, X =/= Y,... or if the number of
variables is unknown build something like distinct(List).

I don't know your original problem. E.g. it is not clear for me whether
in the e([X,Y]) term are always two elements or more.

Cu
Josef

There is often confusion when lists and sets are used (in Prolog or
any other language, but let's do this for Prolog) and this pops up
almost always when the subset predicate raises its head.

Here is the origin of this confusion in a nutshell:

- a set is a mathematical concept (just like a tree or an algebra)
- a list is a Prolog data structure

One can use Prolog lists to represent a set, for instance the set

{x,a,p}

can be represented by the Prolog list [x,a,p], and here comes the
problem: you can also represent it as [a,p,x] (and some 4 more
permutations of that list). You are even free to represent the same
set as the Prolog list [a,x,x,p,a], which contains duplicates, even
though sets don't. Meaning that you can choose to have a set as Prolog
list representation that is not unique. That is perfectly ok, and it
gets you into trouble in many situations. It is usually better to have
a unique representation of every set. But let's not go into which is
"better" because it depends on what you are doing. The important
thing is however:

you must know what you are doing, and it helps to make it explicit

So, if you meant

a set S is represented by any list that contains all the members of S
at least once in any order and no other elements

then the predicate

subset([X|R],S) :- member(X,S), subset(R,S).
subset([],_).

says

the first argument represents a subset of the set represented in
the second argument

and you are doing fine. And don't be surprised that the number of
solutions to your subset/2 is infinite, even if the second argument is
ground.

However, if you want your representation to have certain properties of
sets (e.g. unique elements), then you better make sure that your
predicates cater for that, because you are not getting it for free !
A sorted list without duplicates would then be a good representation.
And the number of solutions to a subset/2 predicate (when the second
argument is ground) will be 2^N as in mathematics.

Cheers

Bart Demoen- Hide quoted text -

- Show quoted text -

Dear Bart and Josef,

I was aware of all of what you mentioned, but I never really seriously
programmed a lot of list type applications before. So, it took me a
while. For now, I am just lookng for a solution. Not worried too much
about efficiency (thats why I prefer using XSB, Coral or Aditi systems
that are built around Datalog type ideas [you can probably tell I am a
database person]) related issues. But I ahve to look at it eventually.
Coming to the point, I wrote the following rules and it now works,

n(a,2,[b,d],1,[e(b,d)]).
n(b,3,[a,c,d],2,[e(a,d),e(c,d)]).
n(c,4,[b,d,e,f],2,[e(e,d),e(b,d)]).
n(d,4,[a,c,b,e],3,[e(c,b),e(a,b),e(c,e)]).
n(e,2,[c,d],1,[e(c,d)]).
n(f,3,[c,h,r],1,[e(h,r)]).
n(h,3,[f,r,p],2,[e(r,f),e(r,p)]).
n(p,2,[h,r],1,[e(h,r)]).
n(r,3,[f,h,p],2,[e(f,h),e(h,p)]).

query(V, Qnc, Qnb, Qts, Qbs):- n(V, Nc, Nb, Ts, Bs), Nc >= Qnc, Ts >=
Qts, subset(Qnb,Nb), subset(Qbs,Bs).

subset([X|R],S) :- refmember(X,S), subset(R,S).
subset([],_).

refmember(X,S) :- member(X,S).
refmember(e(X,Y),S) :- member(e(Y,X),S).

dist(X,[]).
dist(X,L) :- not member(X,L).
distinct([]).
distinct([X|L]) :- dist(X,L), distinct(L).

For the query,

? query(C,4,[A,B,E,F],2,[e(A,B),e(E,F)]), query(F,3,[H,C,E],1,[e
(C,E)]), distinct([A,B,C,E,F,H]).

XSB returned the following bindings.

C = d
A = a
B = b
E = e
F = c
H = f;

C = d
A = a
B = b
E = e
F = c
H = f;

C = d
A = b
B = a
E = e
F = c
H = f;

C = d
A = b
B = a
E = e
F = c
H = f;

I am happy with

C = d
A = a
B = b
E = e
F = c
H = f;

which I was looking for. This a graph problem I am trying to solve for
protein-protein interaction networks
in Life sciences. The n predicates are supposed to model each node or
a protein the network and the query
goals are trying to find a sub-structure of each other proteins that
looks like the query goals. For example,
for query(C,4,[A,B,E,F],2,[e(A,B),e(E,F)]), n(d,4,[a,c,b,e],3,[e(a,b),e
(c,e)]) is a substructure, but not
n(d,4,[a,c,b,e],3,[e(c,b),e(a,b)]) or n(d,4,[a,c,b,e],3,[e(c,b),e
(c,e)]). Because all variables must unify
with unique terms. The distinct subgoal makes sure of that. I was
hoping that I could do away without adding
this subgoal. Also not that e(a,b) is identical to e(b,a). The
refmember (reflective member) rules make sure
that we do not miss that sight when we look for subsets.

Now, I am facing a new problem. These protein graphs are actually in
the form

p(a,b).
p(a,c).
p(b,c).
p(b,d).
....

From this representation, I need to convert them to the n
representation above. To do that, I was hoping to use
Prolog's findall construct. But the version of Prolog I am using, does
not support many of the features standard
Prolog supports, inclusing findall. So, I wrote my own version of find-
all as follows.

find-all(X,Goal,Bag) :- post_it(X,Goal), gather([],Bag).

post_it(X,Goal) :- call(Goal), asserta(data999(X)), fail.
post_it(_,_).

gather(B,Bag) :- data999(X), retract(data999(X)), gather([X|B],Bag),
distinct(Bag), !.
gather(S,S).

dist(X,[]).
dist(X,L) :- not member(X,L).
distinct([]).
distinct([X|L]) :- dist(X,L), distinct(L).

When I query find-all(n(X,Y), p(X,Y), L), find-all now returns L = [n
(a,b), n(a,c), n(b,c), n(b,d)] just fine.
From these, I can possibly write a few more rules to generate what I
need. But is there a better way to collect
all the Y columns of p(X,Y) in the form of set(X, [Y1, ..., Yn]) where
Yis are the values in the second column.
I think XSB's findall may be useful, but I am not certain if it will
act like groupby in SQL. I need to write it
because many Prolog versions appears to be missing this feature.

Thanks once again.

- Hasan
.

## Relevant Pages

• Re: how to deal with various representations of matrices ?
... REPL) that happens to be in various formats (2D arrays, lists of ... representation. ... Now, the nice thing is that once you've wrote the 2N adaptors, you can ...   ...
(comp.lang.lisp)
• Re: N-QUEENS problem with some differences
...     M is K-1, ... little different from Prolog. ... If the syntax for lists and "anonymous variables" ... squares and the position of queens as another ...
(comp.lang.prolog)
• Re: Complex Subset Computing in Prolog
...    variables is unknown build something like distinct. ... There is often confusion when lists and sets are used (in Prolog or ... but let's do this for Prolog) and this pops up ... list representation that is not unique. ...
(comp.lang.prolog)
• Re: N-QUEENS problem with some differences
... I confirm you that a "queen" who is on a square, ...     M is K-1, ... little different from Prolog. ... If the syntax for lists and "anonymous variables" ...
(comp.lang.prolog)
• Re: Complex Subset Computing in Prolog
... There is often confusion when lists and sets are used (in Prolog or ... but let's do this for Prolog) and this pops up ... almost always when the subset predicate raises its head. ... list representation that is not unique. ...
(comp.lang.prolog)