What is the Fisher-Rao distance?
Introduction
How can we measure the difference between two probability distributions in a principled way? One approach is through the Fisher-Rao distance, a geometric measure of dissimilarity based on the Fisher information matrix (FIM).
To understand this distance, we first examine the role of the FIM in statistical inference and its interpretation as a Riemannian metric. This metric structure naturally leads to the Fisher-Rao distance, defined as the length of the shortest path, or geodesic, between distributions on the statistical manifold. However, directly computing this distance is often intractable.
To address this challenge, we introduce an alternative approach: the square-root transformation, which embeds probability densities into a Hilbert space. A key result follows from this transformation: the Fisher-Rao distance is exactly twice the geodesic distance between transformed densities on the unit sphere. This connection not only simplifies computations but also offers new insights into the geometry of probability distributions.
Fisher Information Matrix
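In a standard mathematical statistics course, one encounters the Fisher information matrix (FIM) for a probability density function $p(x \mid \theta)$ with parameter $\theta \in \Theta \subseteq \mathbb{R}^d$:

$$I(\theta) = \mathbb{E}_{x \sim p(\cdot \mid \theta)}\left[ \nabla_\theta \log p(x \mid \theta)\, \nabla_\theta \log p(x \mid \theta)^\top \right].$$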
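The Fisher information matrix plays a central role in statistical inference. In maximum likelihood estimation (MLE), it helps assess the asymptotic variance of the MLE: under standard regularity conditions, for an i.i.d. sample of size $n$ with true parameter $\theta_0$,

$$\sqrt{n}\,(\hat{\theta}_n - \theta_0) \xrightarrow{d} \mathcal{N}\left(0,\; I(\theta_0)^{-1}\right),$$

where $I(\theta_0)$ denotes the FIM evaluated at $\theta_0$.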
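As a small illustration of the definition, the sketch below estimates the FIM as the expected outer product of the score by Monte Carlo. The univariate normal family with parameters $(\mu, \sigma)$ is an assumption chosen for this example (the text does not fix a family), and the helper names are hypothetical. For this family the exact FIM is $\mathrm{diag}(1/\sigma^2,\; 2/\sigma^2)$, which gives something to compare against.

```python
import numpy as np

def score_gaussian(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    return np.stack([d_mu, d_sigma], axis=-1)

def fisher_information_mc(mu, sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[score score^T] under N(mu, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n_samples)
    s = score_gaussian(x, mu, sigma)   # shape (n_samples, 2)
    return s.T @ s / n_samples         # average outer product, shape (2, 2)

mu, sigma = 1.0, 2.0                   # illustrative values only
fim_mc = fisher_information_mc(mu, sigma)
fim_exact = np.diag([1.0 / sigma**2, 2.0 / sigma**2])

print(fim_mc)
print(fim_exact)
```

The two printed matrices should agree up to Monte Carlo error, which shrinks at the usual $1/\sqrt{n}$ rate.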
Fisher-Rao Metric
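A key discovery in information geometry (Amari et al. 2007) is that the FIM induces a Riemannian metric on the statistical manifold $\mathcal{M} = \{ p(\cdot \mid \theta) : \theta \in \Theta \}$. For tangent vectors $u, v$ at $\theta$, the inner product is $g_\theta(u, v) = u^\top I(\theta)\, v$, where $I(\theta)$ is the FIM.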
Fisher-Rao Distance
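Since the Fisher-Rao metric defines a Riemannian structure, one can define a distance between two points in the manifold. The Fisher-Rao distance between two distributions, $p(\cdot \mid \theta_1)$ and $p(\cdot \mid \theta_2)$, is the length of the shortest curve connecting them:

$$d_{FR}(\theta_1, \theta_2) = \inf_{\gamma} \int_0^1 \sqrt{\dot{\gamma}(t)^\top I(\gamma(t))\, \dot{\gamma}(t)}\; dt, \tag{1}$$

where the infimum is over smooth curves $\gamma : [0, 1] \to \Theta$ with $\gamma(0) = \theta_1$ and $\gamma(1) = \theta_2$, and $I(\cdot)$ is the Fisher information matrix.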
Computing the Fisher-Rao Distance
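While theoretically elegant, computing the Fisher-Rao distance as in Equation 1 is challenging because it requires solving the geodesic equations:

$$\ddot{\gamma}^k(t) + \sum_{i,j} \Gamma^k_{ij}(\gamma(t))\, \dot{\gamma}^i(t)\, \dot{\gamma}^j(t) = 0, \qquad \gamma(0) = \theta_1, \quad \gamma(1) = \theta_2,$$

where $\Gamma^k_{ij}$ are the Christoffel symbols of the Fisher-Rao metric. This is a nonlinear two-point boundary value problem.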
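To see why the geodesic matters, the following sketch measures the length of a discretized curve under the Fisher-Rao metric and compares it with a known geodesic distance. It assumes the univariate normal family in $(\mu, \sigma)$ coordinates, where the metric is $ds^2 = (d\mu^2 + 2\, d\sigma^2)/\sigma^2$ and a closed-form distance is available from hyperbolic geometry. Both the family and the function names are illustrative choices, not part of the original discussion.

```python
import numpy as np

def fisher_rao_curve_length(path):
    """Length of a polyline in (mu, sigma) under ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2.

    `path` is an (n, 2) array of (mu, sigma) points. The metric is evaluated
    at segment midpoints, so this is a discretized approximation.
    """
    d = np.diff(path, axis=0)
    sigma_mid = 0.5 * (path[1:, 1] + path[:-1, 1])
    ds = np.sqrt(d[:, 0] ** 2 + 2.0 * d[:, 1] ** 2) / sigma_mid
    return ds.sum()

def fisher_rao_normal_closed_form(mu1, s1, mu2, s2):
    """Closed-form geodesic distance within the univariate normal family."""
    arg = 1.0 + (0.5 * (mu1 - mu2) ** 2 + (s1 - s2) ** 2) / (2.0 * s1 * s2)
    return np.sqrt(2.0) * np.arccosh(arg)

# Two normals with equal standard deviation and different means.
theta1, theta2 = np.array([0.0, 1.0]), np.array([3.0, 1.0])

# Straight line in parameter space: a valid curve, but not the geodesic.
t = np.linspace(0.0, 1.0, 2001)[:, None]
straight = (1.0 - t) * theta1 + t * theta2

len_straight = fisher_rao_curve_length(straight)
d_exact = fisher_rao_normal_closed_form(*theta1, *theta2)
print(len_straight, d_exact)
```

The straight segment is strictly longer than the closed-form distance here: the true geodesic bends toward larger $\sigma$, where the metric is cheaper. For families without such a closed form, finding that bend is exactly the boundary value problem above.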
Square-Root Transformation and Geodesic Distance
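A practical alternative for computing the Fisher-Rao distance is the square-root transformation:

$$\psi(x) = \sqrt{p(x)}.$$

It is straightforward to verify that the transformed functions lie on the infinite-dimensional unit sphere $\mathcal{S}^\infty = \{ \psi \in L^2 : \|\psi\|_2 = 1 \}$, since $\|\psi\|_2^2 = \int \psi(x)^2\, dx = \int p(x)\, dx = 1$.

On the unit sphere in $L^2$, the geodesic distance between two functions $\psi_1$ and $\psi_2$ is the arc length of the great circle joining them:

$$d_{\mathcal{S}^\infty}(\psi_1, \psi_2) = \arccos\left( \langle \psi_1, \psi_2 \rangle \right) = \arccos\left( \int \psi_1(x)\, \psi_2(x)\, dx \right).$$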
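A minimal numerical sketch of the square-root transformation follows. It assumes the densities can be evaluated on a common grid; the two normal densities and the grid limits are illustrative choices only. It checks that each square-root density has unit $L^2$ norm and then computes the arc-length distance between them.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on the array x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Uniform grid wide enough that both densities are negligible at the ends.
x = np.linspace(-12.0, 12.0, 20001)
dx = x[1] - x[0]

# Square-root transformation: psi = sqrt(p).
psi1 = np.sqrt(normal_pdf(x, 0.0, 1.0))
psi2 = np.sqrt(normal_pdf(x, 1.0, 1.5))

# L2 norms via a Riemann sum; both should be close to 1.
norm1 = np.sqrt(np.sum(psi1**2) * dx)
norm2 = np.sqrt(np.sum(psi2**2) * dx)

# Great-circle distance on the unit sphere; clip guards against
# round-off pushing the inner product slightly outside [-1, 1].
inner = np.sum(psi1 * psi2) * dx
sphere_dist = np.arccos(np.clip(inner, -1.0, 1.0))

print(norm1, norm2, sphere_dist)
```

Because square-root densities are nonnegative, the inner product lies in $[0, 1]$ and the arc length never exceeds $\pi/2$.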
Equivalence of Fisher-Rao Distance and Geodesic Distance of Square Root Densities
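We are now ready to state the main result of this discussion. The Fisher-Rao distance between two distributions, $p_1$ and $p_2$, is twice the geodesic distance between their square-root densities on the unit sphere:

$$d_{FR}(p_1, p_2) = 2 \arccos\left( \int \sqrt{p_1(x)\, p_2(x)}\; dx \right). \tag{2}$$

Here the shortest path is taken over the space of all densities. If paths are restricted to a particular parametric family, the geodesic can only be longer, so the right-hand side is then a lower bound on the distance within that family.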
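The main result can be turned into a few lines of code once the integral of $\sqrt{p_1 p_2}$ (often called the Bhattacharyya coefficient) is available. The sketch below evaluates it on a grid and compares the result with the closed-form coefficient for two univariate normals. The choice of normal densities, the grid, and all function names are assumptions made for illustration.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on the array x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fisher_rao_on_grid(p1, p2, dx):
    """2 * arccos of the integral of sqrt(p1 * p2), via a Riemann sum."""
    bc = np.sum(np.sqrt(p1 * p2)) * dx
    return 2.0 * np.arccos(np.clip(bc, 0.0, 1.0))

def bhattacharyya_normal(mu1, s1, mu2, s2):
    """Closed-form integral of sqrt(p1 * p2) for two univariate normals."""
    var_sum = s1**2 + s2**2
    return np.sqrt(2.0 * s1 * s2 / var_sum) * np.exp(-((mu1 - mu2) ** 2) / (4.0 * var_sum))

x = np.linspace(-15.0, 15.0, 30001)
dx = x[1] - x[0]
p1, p2 = normal_pdf(x, 0.0, 1.0), normal_pdf(x, 2.0, 0.5)

d_grid = fisher_rao_on_grid(p1, p2, dx)
d_closed = 2.0 * np.arccos(bhattacharyya_normal(0.0, 1.0, 2.0, 0.5))
d_same = fisher_rao_on_grid(p1, p1, dx)   # distance from a density to itself

print(d_grid, d_closed, d_same)
```

The distance computed this way is bounded above by $\pi$, a value reached only when the two densities have disjoint supports.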
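To verify Equation 2, consider an infinitesimal perturbation in $\theta$, namely $\theta \mapsto \theta + d\theta$, and write $\psi_\theta(x) = \sqrt{p(x \mid \theta)}$. By the chain rule,

$$\frac{\partial \psi_\theta(x)}{\partial \theta_i} = \frac{1}{2} \sqrt{p(x \mid \theta)}\; \frac{\partial \log p(x \mid \theta)}{\partial \theta_i}, \tag{3}$$

so the squared $L^2$ length of the induced displacement $d\psi$ is

$$\|d\psi\|_2^2 = \sum_{i,j} \left\langle \frac{\partial \psi_\theta}{\partial \theta_i}, \frac{\partial \psi_\theta}{\partial \theta_j} \right\rangle d\theta_i\, d\theta_j = \frac{1}{4}\, d\theta^\top I(\theta)\, d\theta, \tag{4}$$

where $I(\theta)$ is the Fisher information matrix.

From Equation 4, we observe that the $L^2$ metric on square-root densities is one quarter of the Fisher-Rao metric, or equivalently $ds_{FR}^2 = 4\, \|d\psi\|_2^2$.

Since $\int \psi_\theta(x)^2\, dx = 1$ for every $\theta$, the displacement $d\psi$ is tangent to the unit sphere, and $\|d\psi\|_2$ is precisely the infinitesimal arc length on the sphere.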
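By combining these observations with the fact that geodesic distances scale with the square root of the metric factor (multiplying a metric by a constant $c > 0$ multiplies every curve length by $\sqrt{c}$), we conclude that the geodesic distance computed in the Fisher-Rao metric, which is four times the $L^2$ metric on square-root densities, is twice the standard geodesic distance on the unit sphere in $L^2$.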
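The factor of one quarter between the two metrics can be checked numerically. The sketch below perturbs only the mean of a normal density, for which the relevant Fisher information entry is $1/\sigma^2$, and compares the squared $L^2$ displacement of the square-root density with the Fisher-Rao quadratic form. The family, the parameter values, and the step size `eps` are illustrative assumptions.

```python
import numpy as np

def sqrt_normal_pdf(x, mu, sigma):
    """Square-root density psi = sqrt(p) for N(mu, sigma^2)."""
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return np.sqrt(p)

x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]

mu, sigma, eps = 0.5, 2.0, 1e-3   # eps is a small perturbation of the mean

# Squared L2 displacement of the square-root density.
dpsi = sqrt_normal_pdf(x, mu + eps, sigma) - sqrt_normal_pdf(x, mu, sigma)
l2_sq = np.sum(dpsi**2) * dx

# Fisher-Rao quadratic form d_theta^T I(theta) d_theta, with I_mu_mu = 1 / sigma^2.
fisher_sq = eps**2 / sigma**2

ratio = l2_sq / fisher_sq
print(ratio)   # close to 0.25 for small eps
```

Taking square roots, the $L^2$ displacement is half the Fisher-Rao length element, which is where the factor of two in the distance comes from.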
Final Thoughts
- In the derivation of the Fisher-Rao distance in terms of the scaled geodesic distance on the unit sphere $\mathcal{S}^\infty$, no specific choice or constraint is imposed on $\theta$. This formulation naturally extends to nonparametric probability densities, as the geodesic distance in $L^2$ depends solely on the embedding.
- Despite the equivalence, some concerns remain regarding the computability of the Fisher-Rao distance. For instance, evaluating the integral $\int \sqrt{p_1(x)\, p_2(x)}\, dx$ can be challenging in many cases, particularly for high-dimensional distributions or arbitrary nonparametric densities, where numerical integration becomes intractable. Approximation via Monte Carlo methods, while useful, presents its own set of difficulties. Moreover, when densities are estimated, they may contain regions of extremely small values, leading to numerical precision issues.
- I would like to express my appreciation to Prof. Marco Radeschi for his kindness and patience in enduring my endless questions in his differential geometry course - many of which, in hindsight, seem absurd.
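As a rough illustration of the Monte Carlo route mentioned in the remarks above, the integral of $\sqrt{p_1 p_2}$ can be rewritten as $\mathbb{E}_{x \sim p_1}\big[\sqrt{p_2(x)/p_1(x)}\big]$ and estimated from samples of $p_1$. The sketch below works with log-densities to soften the small-value problem. It assumes both log-densities can be evaluated pointwise and that $p_1$ can be sampled; the normal densities and function names are hypothetical choices for the example.

```python
import numpy as np

def log_normal_pdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

def fisher_rao_monte_carlo(log_p1, log_p2, samples_from_p1):
    """Estimate 2 * arccos(E_{p1}[sqrt(p2 / p1)]) from samples of p1."""
    half_log_ratio = 0.5 * (log_p2(samples_from_p1) - log_p1(samples_from_p1))
    # Subtract the maximum before exponentiating to limit overflow/underflow.
    shift = half_log_ratio.max()
    bc = np.exp(shift) * np.mean(np.exp(half_log_ratio - shift))
    return 2.0 * np.arccos(np.clip(bc, 0.0, 1.0))

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=200_000)   # draws from p1 = N(0, 1)

d_mc = fisher_rao_monte_carlo(
    lambda x: log_normal_pdf(x, 0.0, 1.0),
    lambda x: log_normal_pdf(x, 1.0, 1.5),
    samples,
)
print(d_mc)
```

The second moment of the integrand is $\int p_2(x)\, dx = 1$ whenever $p_1 > 0$ wherever $p_2 > 0$, so the estimator's variance is bounded. The difficulties noted above remain in high dimensions, where log-density evaluations of estimated densities can themselves be unreliable.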
References
Citation
@online{you2025,
author = {You, Kisung},
title = {What Is the {Fisher-Rao} Distance?},
date = {2025-02-05},
url = {https://kisungyou.com/Blog/blog_001_FisherRao.html},
langid = {en}
}