KL Divergence and FID
KL Divergence (Kullback-Leibler Divergence)
Definition: KL Divergence measures how one probability distribution $P$ diverges from a second, reference distribution $Q$.
Mathematical Formula:
$$D_{KL}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$
or in the continuous case:
$$D_{KL}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx$$
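As a concrete illustration, here is a minimal sketch of the discrete formula using NumPy and SciPy; the distributions `p` and `q` below are made-up examples, not values from the text:

```python
import numpy as np
from scipy.stats import entropy

# Two example discrete distributions over the same four outcomes (made-up values).
p = np.array([0.4, 0.3, 0.2, 0.1])
q = np.array([0.25, 0.25, 0.25, 0.25])

# Direct implementation of sum_x P(x) * log(P(x) / Q(x)).
kl_pq = np.sum(p * np.log(p / q))

# scipy.stats.entropy computes the same relative entropy when given two arguments.
assert np.isclose(kl_pq, entropy(p, q))

# The divergence is not symmetric: D_KL(P || Q) != D_KL(Q || P) in general.
kl_qp = np.sum(q * np.log(q / p))
print(f"D_KL(P||Q) = {kl_pq:.4f}, D_KL(Q||P) = {kl_qp:.4f}")
```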
Characteristics:
- Non-symmetric: $D_{KL}(P \,\|\, Q) \neq D_{KL}(Q \,\|\, P)$ in general.
- Non-negative: Always non-negative and zero if and only if $P = Q$.
- Sensitive to the support of $Q$: If $P(x) > 0$ and $Q(x) = 0$ for some $x$, KL Divergence goes to infinity (see the sketch after this list).
- Measures relative entropy: Quantifies the amount of information lost when $Q$ is used to approximate $P$.
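The support sensitivity is easy to trigger numerically; a small sketch with made-up distributions, continuing the NumPy setup from the previous example:

```python
import numpy as np

p = np.array([0.5, 0.5, 0.0])
q = np.array([0.5, 0.0, 0.5])  # q assigns zero probability to an outcome p cares about

# Convention: terms with p(x) = 0 contribute 0; terms with p(x) > 0 and q(x) = 0 blow up.
with np.errstate(divide="ignore", invalid="ignore"):
    terms = np.where(p > 0, p * np.log(p / q), 0.0)

print(terms.sum())  # inf -- D_KL(P || Q) is unbounded when the supports do not overlap
```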
Uses:
- Model evaluation: Comparing theoretical distributions with empirical distributions.
- Information theory: Quantifying information gain in Bayesian updating.
- Optimization: Often used in the training of machine learning models, such as in Variational Autoencoders (VAEs).
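For example, in a VAE the KL term between a diagonal Gaussian posterior $\mathcal{N}(\mu, \sigma^2)$ and a standard normal prior has a well-known closed form. The sketch below is a minimal NumPy version; the helper name and the $\mu$, $\log\sigma^2$ values are made up for illustration:

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.

    Closed form per dimension: 0.5 * (mu^2 + sigma^2 - log(sigma^2) - 1).
    """
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0)

# Made-up encoder outputs for a 4-dimensional latent space.
mu = np.array([0.1, -0.3, 0.0, 0.5])
logvar = np.array([-0.2, 0.1, 0.0, -0.5])

print(gaussian_kl_to_standard_normal(mu, logvar))  # KL penalty added to the VAE loss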
FID (Frechet Inception Distance)
Definition: FID measures the distance between two distributions of images by comparing their feature representations obtained from a pre-trained network (typically Inception-v3). Each feature set is summarized by its mean and covariance, and FID is the Frechet distance between the two resulting Gaussian approximations.
Mathematical Formula:
$$\text{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)$$
where $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of real and generated image features, respectively.
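A minimal sketch of this formula, assuming the features have already been extracted as two arrays of shape (num_samples, feature_dim); the helper name and the random "features" are illustrative, and production implementations add further numerical safeguards around the matrix square root:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(real_feats, gen_feats):
    """FID between two feature sets, each of shape (n_samples, feature_dim)."""
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_g = np.cov(gen_feats, rowvar=False)

    # Matrix square root of the product of covariances; sqrtm can return small
    # imaginary parts due to numerical error, so keep only the real part.
    covmean = sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu_r - mu_g
    return diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean)

# Random stand-ins for Inception-v3 activations of real and generated images.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 16))
fake = rng.normal(0.3, 1.2, size=(500, 16))
print(frechet_inception_distance(real, fake))
```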
Characteristics:
- Symmetric: $\mathrm{FID}(P, Q) = \mathrm{FID}(Q, P)$.
- Robust: More robust to noise and minor variations in the data.
- Sensitive to both quality and diversity: Captures discrepancies in both the mean and the spread of feature distributions.
- Uses pre-trained model: Relies on Inception-v3 for feature extraction, providing a standardized comparison.
Uses:
- GAN evaluation: Widely used to evaluate the performance of Generative Adversarial Networks.
- Image synthesis: Measuring the quality and diversity of generated images.
- Model comparison: Comparing different generative models or different training strategies.
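The statistics used in the FID formula come from features of a pre-trained Inception-v3. A rough sketch of extracting 2048-dimensional pooled features with torchvision is shown below; the `extract_features` helper and the random dummy batch are illustrative, and dedicated FID libraries (e.g. pytorch-fid, torchmetrics) handle preprocessing and batching more carefully:

```python
import torch
from torchvision.models import inception_v3, Inception_V3_Weights

weights = Inception_V3_Weights.DEFAULT
model = inception_v3(weights=weights)
model.fc = torch.nn.Identity()  # drop the classifier to expose the 2048-d pooled features
model.eval()

preprocess = weights.transforms()  # resizes to 299x299 and applies ImageNet normalization

@torch.no_grad()
def extract_features(images):
    """images: float tensor of shape (batch, 3, H, W) with values in [0, 1]."""
    return model(preprocess(images))

# Made-up batch of 8 random "images" just to show the shapes involved.
dummy = torch.rand(8, 3, 256, 256)
feats = extract_features(dummy)
print(feats.shape)  # expected: torch.Size([8, 2048])
```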
Comparison Table
| Feature | KL Divergence | FID (Frechet Inception Distance) |
|---|---|---|
| Symmetry | Non-symmetric | Symmetric |
| Sensitivity to Support | Highly sensitive to zero probabilities | Robust to support mismatches |
| Measures | Relative entropy | Distance in feature space |
| Application Domain | Information theory, model optimization | Generative model evaluation |
| Computation | Direct probability comparison | Feature-based comparison using pre-trained model |
| Typical Uses | Variational Autoencoders, Bayesian methods | GAN evaluation, image quality assessment |
| Robustness | Can be unstable with non-overlapping supports | Robust to noise and minor variations |
| Quality vs Diversity | Less effective in capturing diversity | Captures both quality and diversity |
Conclusion
KL Divergence and FID are both valuable metrics, but they serve different purposes. KL Divergence is primarily used in theoretical contexts and in model optimization tasks where direct probability comparisons are feasible. FID, on the other hand, is tailored to evaluating generative models, especially in terms of image quality and diversity; its symmetry and robustness make it the preferred choice in practical applications involving image synthesis.