DRMacIver's Notebook

Testing if nearness is an equivalence.

Testing if nearness is an equivalence.

Suppose you have points \(X = \{x_1, \ldots, x_n\}\) and an arbitrary pseudometric \(d: X^2 \to \mathbb{R}^+\). Let \(\epsilon > 0\). Can you decide whether the relationship \(x \tilde y \equiv d(x, y) \leq \epsilon\) is an equivalence relationship in better than \(O(n^3)\) time?

You certainly can’t do it in less than \(O(n^2)\) time. Consider \(d(x_i, x_j) = 1_{(i, j) = (u, v)}\) and \(\epsilon = \frac{1}{2}\). Any algorithm that has not checked all pairs \(u, v\) will fail to distinguish one of these from the pseudometric that is always \(0\).

I’m reasonably confident the answer is that there is no better than \(O(n^3)\) solution but I haven’t checked. I’m just writing this problem down because I popped into my head and I wanted to get it out of there.

Update: No I’m being stupid. There’s a general \(O(n^2 \log(n))\) algorithm to check if any reflexive and symmetric relationship is an equivalence relationship. It works as follows:

import itertools

def is_equivalence(points, r):
  vec_to_fingerprint = {}

  fingerprints = {
    x: vec_to_fingerprint.setdefault(
      tuple(r(x, y) for y in points),
      len(vec_to_fingerprint)
    )
    for x in points
  }

  return all(
    not r(u, v) or (fingerprints[u] == fingerprints[v])
    for u, v in itertools.combinations(points, 2)
  )

We “fingerprint” each point by the set of points it is equivalent to, turning it into integers we can compare in \(O(\log(n))\) time. We now test whether every equivalent pair has the same fingerprint. If yes, then the equivalence relationship is just whether the two points have the same fingerprint. If no, that witnesses an intransitivity.

I don’t know if you can get rid of the \(\log(n)\) factor, but now I really don’t care.