KL Divergence and FID

KL Divergence (Kullback-Leibler Divergence)

Definition: KL Divergence measures how one probability distribution P diverges from a second, reference probability distribution Q.

Mathematical Formula:

D_{KL}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}

or in the continuous case:

D_{KL}(P \,\|\, Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx
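
As a minimal numerical illustration of the discrete formula above (the distributions P and Q below are made-up examples), the following sketch computes the sum directly and shows that swapping the arguments changes the result:

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x)), in nats."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute 0 by convention
    # If q(x) = 0 where p(x) > 0, the result is inf (see "sensitive to support" below).
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

P = np.array([0.9, 0.1])  # example distribution over two outcomes
Q = np.array([0.5, 0.5])

print(kl_divergence(P, Q))  # ~0.368 nats
print(kl_divergence(Q, P))  # ~0.511 nats -- KL Divergence is not symmetric
```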

Characteristics:

  • Non-symmetric: D_{KL}(P \,\|\, Q) \neq D_{KL}(Q \,\|\, P) in general.
  • Non-negative: Always non-negative, and zero if and only if P = Q (almost everywhere).
  • Sensitive to the support of Q: If Q(x) = 0 while P(x) > 0 for some x, the KL Divergence goes to infinity.
  • Measures relative entropy: Quantifies the amount of information lost when Q is used to approximate P.

Uses:

  • Model evaluation: Comparing theoretical distributions with empirical distributions.
  • Information theory: Quantifying information gain in Bayesian updating.
  • Optimization: Often used in the training of machine learning models, such as in Variational Autoencoders (VAEs); see the sketch after this list.
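
As a concrete, hedged example of the optimization use above: in a VAE, the KL term between the encoder's diagonal Gaussian posterior and a standard normal prior has a closed form. The sketch below assumes `mu` and `logvar` are encoder outputs; the names are illustrative, not tied to any particular library.

```python
import torch

def vae_kl_term(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over the batch."""
    # Closed form: -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl_per_sample = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    return kl_per_sample.mean()

# Illustrative usage with a random batch of 8 samples and a 16-dimensional latent space
mu, logvar = torch.randn(8, 16), torch.randn(8, 16)
loss_kl = vae_kl_term(mu, logvar)
```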

FID (Fréchet Inception Distance)

Definition: FID measures the distance between two distributions of images by comparing their feature representations obtained from a pre-trained network (typically Inception-v3).

Mathematical Formula:

\mathrm{FID} = \lVert \mu_r - \mu_g \rVert^2 + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)

where \mu_r, \Sigma_r and \mu_g, \Sigma_g are the mean and covariance of the real and generated image features, respectively.
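
Given the two sets of features, the formula can be evaluated directly. The sketch below is a minimal NumPy/SciPy version, assuming `real_feats` and `gen_feats` are (N, 2048) arrays of pre-extracted Inception-v3 features; it is not the reference implementation, and in practice each set should contain many thousands of samples so that the covariance estimates are stable.

```python
import numpy as np
from scipy import linalg

def compute_fid(real_feats, gen_feats):
    """Fréchet distance between Gaussians fitted to the two feature sets."""
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_g = np.cov(gen_feats, rowvar=False)

    diff = mu_r - mu_g
    # Matrix square root of the covariance product; numerical error can introduce
    # a tiny imaginary component, which is discarded.
    covmean, _ = linalg.sqrtm(sigma_r @ sigma_g, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))
```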

Characteristics:

  • Symmetric: Swapping the real and generated distributions yields the same value.
  • Robust: More robust to noise and minor variations in the data.
  • Sensitive to both quality and diversity: Captures discrepancies in both the mean and the spread of feature distributions.
  • Uses pre-trained model: Relies on Inception-v3 for feature extraction, providing a standardized comparison; see the feature-extraction sketch after this list.
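
For concreteness, the sketch below shows one way to obtain the Inception-v3 features with torchvision (assumed version >= 0.13 for the `weights=` API). Preprocessing details vary between FID implementations, so treat this as an illustration rather than the canonical pipeline:

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Load Inception-v3 and drop the classification head to expose the 2048-d pooled features.
model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
model.fc = nn.Identity()
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),  # Inception-v3 expects 299x299 inputs
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(pil_images):
    """Map a list of PIL images to an (N, 2048) feature tensor."""
    batch = torch.stack([preprocess(img) for img in pil_images])
    return model(batch)
```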

Uses:

  • GAN evaluation: Widely used to evaluate the performance of Generative Adversarial Networks.
  • Image synthesis: Measuring the quality and diversity of generated images.
  • Model comparison: Comparing different generative models or different training strategies.

Comparison Table

| Feature | KL Divergence | FID (Fréchet Inception Distance) |
| --- | --- | --- |
| Symmetry | Non-symmetric | Symmetric |
| Sensitivity to support | Highly sensitive to zero probabilities | Robust to support mismatches |
| Measures | Relative entropy | Distance in feature space |
| Application domain | Information theory, model optimization | Generative model evaluation |
| Computation | Direct probability comparison | Feature-based comparison using a pre-trained model |
| Typical uses | Variational Autoencoders, Bayesian methods | GAN evaluation, image quality assessment |
| Robustness | Can be unstable with non-overlapping supports | Robust to noise and minor variations |
| Quality vs. diversity | Less effective at capturing diversity | Captures both quality and diversity |

Conclusion

KL Divergence and FID are both valuable metrics, but they serve different purposes. KL Divergence is primarily used in theoretical contexts and model optimization tasks where direct probability comparisons are feasible. FID, on the other hand, is tailored for evaluating generative models, especially in terms of image quality and diversity; its robustness and symmetry make it the preferred choice in practical applications such as image synthesis and GAN evaluation.