Re: Rich Comparisons Gotcha



On Tue, 06 Jan 2009 12:42:13 +0000, Mark Wooding wrote:

Steven D'Aprano <steven@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Such assumptions only hold under particular domains though. You can't
assume equality is an equivalence relation once you start thinking
about arbitrary domains.

From a formal mathematical point of view, equality /is/ an equivalence
relation. If you have a relation on some domain, and it's not an
equivalence relation, then it ain't the equality relation, and that's
flat.

Okay, fair enough. In the formal mathematical sense, equality is always
an equivalence relation. So there are certain domains which don't have
equality, e.g. floating point, since nan != nan. Also Python objects,
since x.__eq__(y) is not necessarily the same as y.__eq__(x).



But there cannot be any such function which is a domain-independent
equivalence relation, not if we're talking about arbitrarily wacky
domains.

That looks like a claim which requires a proof to me. But it could also
do with a definition of `domain', so I'll settle for one of those first.

I'm talking about domain in the sense of "a particular problem domain".
That is, the model, data and operations used to solve a problem. I don't
know that I can be more formal than that.

To prove my claim, all you need is two domains with a mutually
incompatible definition of equality. That's not so difficult, surely? How
about equality of integers, versus equality of integers modulo some N?



If we're dealing with sets (i.e., `domain's form a subclass of `sets')
then the claim is clearly false, and equality (determined by comparison
of elements) is indeed a domain-independent equivalence relation.

It isn't domain-independent in my sense, because you have specified one
specific domain, namely set equality.


Even something as straight-forward as "is" can't be an equivalence
relation under a domain where identity isn't well-defined.

You've completely lost me here. The Python `is' operator is (the
characteristic function of) an equivalence relation on Python values:
that's its definition.

Yes, that's because identity is well-defined in Python. I'm saying that
if identity isn't well-defined, then neither is the 'is' operator, and
therefore it isn't an equivalence relation. That shouldn't be
controversial.



All Python objects are instances of `object' or of some more specific
class. The `==' operator on `object' is (the characteristic function
of) an equivalence relation. In, fact, it's the same as `is' -- but
`==' can be overridden by subclasses, and subclasses are permitted --
according to the interface definition -- to coarsen the relation. In
fact, they're permitted to make it not be an equivalence class at all.

I claim that this is a problem.

It *can* be a problem, if you insist on using == on arbitrary types while
still expecting it to be an equivalence relation.

If you drop the requirement that it remain an e-r, then you can apply ==
to arbitrary types. And if you limit yourself to non-arbitrary types,
then you can safely use (say) any strings you like, and == will remain an
e-r.



I /agree/ that domain-specific
predicates are useful, and can be sufficiently useful that they deserve
the `==' name -- as well as floats and numpy, I've provided SAGE and
sympy as examples myself. But I also believe that there are good
reasons to want an `equivalence' operator (I'll write it as `=~', though
I don't propose this as Python syntax -- see below) with the following
properties:

* `=~' is the characteristic function[1] of an equivalence relation,
i.e., for all values x, y, z: x =~ y in (True, False); (x =~ x) ==
True; if x =~ y then y =~ x; and if x =~ y and y =~ z then x =~ z

* Moreover, `=~' is a coarsening of `is', i.e. for all values x, y: if
x is y then x =~ y.


Ah, but you can't have such a generic e-r that applies across all problem
domains. Consider:

Let's denote regular, case-sensitive strings using "abc", and special,
case-insensitive strings using i"abc". So for regular strings, equality
is an e-r; for case-insensitive strings, equality is also an e-r (I
trust that the truth of this is obvious). But if you try to use equality
on *both* regular and case-insensitive strings, it fails to be an e-r:

i"abc" =~ "ABC" returns True if you use the case-insensitive definition
of equality, but returns False if you use the case-sensitive definition.
There is no single definition of equality that is *simultaneously* case-
sensitive and case-insensitive.


A valuable property might be that x =~ y if x and y are
indistinguishable without using `is'.

That's a little strong, because it implies that equality must look at
*everything* about a particular object, not just whatever bits of data
are relevant for the problem domain.

For example, consider storing data in a dict.

D1 = {-1: 0, -2: 0}
D2 = {-2: 0}
D2[-1] = 0
D1 == D2
True


We certainly want D1 and D2 to be equal. But their history is different,
and that makes their internal details different, which has detectable
consequences:

D1
{-2: 0, -1: 0}
D2
{-1: 0, -2: 0}


The same happens with trees. Given a tree structure defined as:

(payload, left-subtree, right-subtree)

do you want the following two trees to be equal?

('b', ('a', None, None), ('c', None, None))

('a', None, ('b', None, ('c', None, None)))

Unless I've made a silly mistake, not only are the payloads of the two
trees equal, but so are the in-order representation of both. Only the
specific order the nodes are stored in differ, and that may not be
important for the specific problem you are trying to solve.

There may be problem domains where the order of elements in a list (or
tree structure) *is* important, and other problem domains where order is
irrelevant. One single relation can't cover all such conflicting
requirements.



--
Steven
.



Relevant Pages

  • Re: An uncountable countable set
    ... Han de Bruijn wrote: ... which is an Equivalence Relation, which in turn is a "generalization" ... that EQUALITY HAS NEVER BEEN DEFINED. ... Consider the equally spaced staircase from to, as the number of steps increases from 1 without bound. ...
    (sci.math)
  • Re: Rich Comparisons Gotcha
    ... incompatible definition of equality. ... versus equality of integers modulo some N? ... a perfectly fine equivalence relation -- but that it can potentially ... are relevant for the problem domain. ...
    (comp.lang.python)
  • Re: JSH: Logic and paradox
    ... Equality is an equivalence relation. ... No offence, but even in logic you need context, you can't just take notation ... the equality symbol. ...
    (sci.math)
  • Re: What is a proof, exactly?
    ... > the formalization of the list of theorems. ... That only the equality of sets may be considered relevant? ... If I have an equivalence relation ... becomes quite subtle now. ...
    (sci.math)
  • Re: JSH: Logic and paradox
    ... Myself, I prefer to understand "=" as an equivalence relation in all cases, ... equality as a mere equivalence relation and *not* a proper identity. ... By the profit criterion, Microsoft has been one of the greatest ... -- ADTI defends Microsoft ...
    (sci.math)