Visual Analysis of the Earth Mover's Distance

Many recent research papers [1,2] focused on neural networks for 3D point cloud data use one of the two distance metrics as loss function or evaluation metric: the Chamfer distance (\(D_{CD}\)) and the Earth Mover's distance (\(D_{EMD}\)). The calculation of the Chamfer distance is easy to grasp, while the Earth Mover's distance is less clearly defined. In this post I want to give an overview of the Earth Mover's distance and in particular how it is used for training point cloud data. The Earth Mover's distance is also known under the term 1-Wasserstein metric, but we will refer to it as the Earth Mover's distance in this post.

Introduction

Let's first look at the mathematical definition of the Earth Mover's distance:

$$ D_{EMD}(S_{1},S_{2}) = \min_{\phi: S_{1} \to S_{2} } \frac{1}{|S_{1}|} \sum_{x \in S_{1}} \| x - \phi(x) \|_{2}$$

The function \(\phi\) is a bijection (this also implies that \(D_{EMD}\) is only defined for \(|S_{1}| = |S_{2}|\)). This can be seen as a matching between points in \(S_{1}\) to points in \(S_{2}\). The matching with the minimal euclidean distance therefore describes the Earth Mover's distance. A common interpretation of the Earth Mover's distance describes the measure as the minimum amount of work needed to spread a mass of earth collected in piles to fill a set of holes in the same given space [4,5]. In this interpretation work is determined both by the weight of the earth moved and the distance that the earth had to be moved to fill the holes.

References

[1] Yuan, Khot, Held, Mertz, Herbert (3DV 2018) PCN: Point Completion Network

[2] Fan, Su, Leonidas (CVPR 2017) A Point Set Generation Network for 3D Object Reconstruction from a Single Image

[3] Lee, Batra, Baig, Ulbricht (CVPR 2019) Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation.

[4] CVOnline on the Earth Mover's Distance The Earth Mover's Distance

[5] Kantorovitch, Lee (Management Science 1958) On the Translocation of Masses