KL Divergence
Requirements for KL Divergence
Consistent Sample Space:
- Both distributions P and Q must be defined over the same sample space; otherwise, comparing their probabilities point by point is meaningless.
Support Coverage:
- The support of distribution Q must cover the support of distribution P. In other words, if P(x) ≠ 0 at any point x, then Q(x) must be greater than 0 at that same point. This ensures that the ratio P(x)/Q(x) is always defined, so the calculation never takes the logarithm of a division by zero.
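The support-coverage requirement above can be checked directly. Below is a minimal sketch for discrete distributions given as probability arrays over a shared sample space; the helper `support_covered` is a hypothetical name introduced here for illustration, not from the original text.

```python
def support_covered(p, q):
    """Return True if supp(P) is contained in supp(Q):
    wherever P(x) > 0, Q(x) must also be > 0."""
    return all(qx > 0 for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.5, 0.0]
q = [0.6, 0.4, 0.0]   # Q is zero only where P is also zero -> KL(P || Q) is defined
r = [1.0, 0.0, 0.0]   # R is zero at a point where P is 0.5 -> KL(P || R) is undefined

print(support_covered(p, q))  # True
print(support_covered(p, r))  # False
```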
Calculation Process of KL Divergence
Discrete Distributions
- For discrete distributions, the KL divergence is calculated by independently measuring the divergence at each point in the sample space and then summing these values:
- D_KL(P ‖ Q) = Σ_x P(x) log(P(x) / Q(x))
- This formula takes the ratio P(x)/Q(x) at each point x, takes the logarithm of this ratio, weights it by P(x), and finally sums the results over all points.
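The discrete calculation above can be sketched in a few lines. This is an illustrative implementation (using the natural logarithm, so the result is in nats); the `kl_divergence` helper name is an assumption of this sketch, and terms with P(x) = 0 are dropped following the standard convention 0 · log 0 = 0.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum over x of P(x) * log(P(x) / Q(x)).
    Terms with P(x) = 0 contribute 0; if Q(x) = 0 where P(x) > 0,
    the support-coverage requirement is violated and KL is infinite."""
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return math.inf  # support of Q does not cover support of P
            total += px * math.log(px / qx)
    return total

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # about 0.5108 nats
print(kl_divergence(p, p))  # 0.0: identical distributions have zero divergence
```

Note that the result is asymmetric: D_KL(P ‖ Q) generally differs from D_KL(Q ‖ P).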
Continuous Distributions
- For continuous distributions, the calculation is similar but uses integration instead of summation:
- D_KL(P ‖ Q) = ∫ p(x) log(p(x) / q(x)) dx
- Here, p(x) and q(x) are the probability density functions of P and Q, respectively, and the integral is taken over the entire sample space.
Summary
The calculation of KL divergence requires a consistent sample space, and the support of the approximating distribution Q must cover the support of the target distribution P. The calculation itself measures the divergence independently at each sample point — taking the ratio P(x)/Q(x), taking its logarithm, and weighting by P(x) — then sums these values in the discrete case or integrates them in the continuous case. The result is a measure of how much one probability distribution differs from another.