Normalizing Flows (NF)
Normalizing Flows (NF) model complex distributions in high-dimensional spaces without reducing dimensionality. This is accomplished by transforming a simple distribution into a complex distribution of the same dimensionality through a sequence of invertible transformation functions. The probability density at a point under the complex distribution is then computed from the probability density of the simple distribution and the Jacobians of the transformation functions.
Key Concepts
- Invertible Transformations: Each transformation in the NF must be invertible, allowing the original data to be recovered exactly from the transformed data.
- Density Transformation: The probability density of the original data is computed from the density of the transformed data via the Jacobian determinant of the transformation.
- Composite Transformations: Multiple simple transformations are composed together to form a complex transformation flow. Each simple transformation is invertible and has a known Jacobian determinant (a numeric check of these properties follows below).
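To make these three concepts concrete, here is a minimal NumPy sketch. The matrices `A1`, `A2` and offsets `b1`, `b2` are arbitrary illustrative choices, not part of any library: it composes two invertible affine maps, recovers the input exactly through the inverses, and checks that the log-determinant of the composite equals the sum of the per-layer log-determinants.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two invertible affine maps f1(x) = A1 @ x + b1 and f2(y) = A2 @ y + b2.
A1, b1 = np.array([[2.0, 0.0], [0.5, 1.0]]), np.array([0.1, -0.3])
A2, b2 = np.array([[1.0, 0.3], [0.0, 0.5]]), np.array([0.0, 1.0])

x = rng.normal(size=2)
z = A2 @ (A1 @ x + b1) + b2  # forward pass through the two-layer flow

# Invertibility: applying the inverse maps in reverse order recovers x exactly.
x_rec = np.linalg.solve(A1, np.linalg.solve(A2, z - b2) - b1)
assert np.allclose(x, x_rec)

# Composite Jacobian determinant = product of per-layer determinants,
# so log|det| accumulates additively across the layers of the flow.
logdet_composite = np.log(abs(np.linalg.det(A2 @ A1)))
logdet_sum = np.log(abs(np.linalg.det(A1))) + np.log(abs(np.linalg.det(A2)))
assert np.isclose(logdet_composite, logdet_sum)
```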
Mathematical Representation
Given data $x \in \mathbb{R}^D$, we aim to map it to a simple distribution $p_Z(z)$ (usually a standard Gaussian) through a series of invertible transformations $f_1, f_2, \ldots, f_K$:

$$z = f_K \circ f_{K-1} \circ \cdots \circ f_1(x)$$

Since these transformations are invertible, we can map back to the original data space:

$$x = f_1^{-1} \circ f_2^{-1} \circ \cdots \circ f_K^{-1}(z)$$

To calculate the probability density of $x$, we use the Jacobian determinant of the composite transformation $f = f_K \circ \cdots \circ f_1$:

$$p_X(x) = p_Z(f(x)) \left|\det \frac{\partial f}{\partial x}\right|$$

where $z = f(x)$. In practice this is evaluated in log space, and the composite log-determinant splits into a sum over the individual transformations (with $z_0 = x$ and $z_k = f_k(z_{k-1})$):

$$\log p_X(x) = \log p_Z(z) + \sum_{k=1}^{K} \log \left|\det \frac{\partial f_k}{\partial z_{k-1}}\right|$$

There are several things to note here:

- $x$ and $z$ need to be continuous and have the same dimension $D$.
- $\frac{\partial f}{\partial x}$ is a matrix of dimension $D \times D$, where each entry at location $(i, j)$ is defined as $\frac{\partial f_i}{\partial x_j}$. This matrix is also known as the Jacobian matrix.
- $\det(A)$ denotes the determinant of a square matrix $A$.
- For any invertible matrix $A$, $\det(A^{-1}) = (\det A)^{-1}$, so for the inverse map $g = f^{-1}$ we have:

  $$p_X(x) = p_Z(z) \left|\det \frac{\partial g}{\partial z}\right|^{-1}$$

- If $\left|\det \frac{\partial f}{\partial x}\right| = 1$, then the mapping is volume preserving, which means that the transformed distribution $p_Z(z)$ will have the same "volume" compared to the original one $p_X(x)$.
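As a quick numerical sanity check of the change-of-variables formula, here is a one-dimensional sketch using SciPy; the choice of map $x = e^z$ (so $f(x) = \log x$) is purely illustrative.

```python
import numpy as np
from scipy.stats import norm, lognorm

# Base distribution: z ~ N(0, 1). Invertible map to data space: x = exp(z),
# so the data-to-base map is f(x) = log(x) with df/dx = 1/x.
x = np.array([0.5, 1.0, 2.0, 5.0])

# p_X(x) = p_Z(f(x)) * |det df/dx|
p_x = norm.pdf(np.log(x)) * (1.0 / x)

# Cross-check against the closed-form log-normal density.
assert np.allclose(p_x, lognorm(s=1.0).pdf(x))
```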
Applications
Normalizing flows have several applications, including:
- Generative Models: Generating new data samples similar to the training data.
- Probability Density Estimation: Precisely estimating the probability density of complex data.
- Data Transformation and Preprocessing: Applying complex, non-linear transformations to data for better modeling in data science and machine learning.
Combining VAE, VI, and NF
Variational Autoencoders (VAEs) can combine Variational Inference (VI) with Normalizing Flows (NF) to enhance the modeling of latent variables. This combination, known as NF-VAE or Flow-VAE, makes the approximate posterior over the latent space more expressive.
Key Points
- Dimension Consistency: Each transformation function $f_k$ in NF must have the same input and output dimensions to ensure invertibility.
- Invertibility: The transformation $f$ must be invertible, with a well-defined inverse transformation $f^{-1}$.
- Jacobian Determinant: Computing the Jacobian determinant of the transformation is essential for adjusting the probability density of the transformed data.
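The following NumPy sketch shows one way these points play out inside a Flow-VAE, using a planar flow (Rezende & Mohamed, 2015) on the latent sample. The parameters `u`, `w`, `b` are random placeholders here; in a real model they would be learned or produced by the encoder.

```python
import numpy as np

def planar_step(z, u, w, b):
    """One planar-flow step f(z) = z + u * tanh(w.z + b) and its log|det J|."""
    a = w @ z + b
    psi = (1.0 - np.tanh(a) ** 2) * w        # h'(a) * w
    log_det = np.log(np.abs(1.0 + u @ psi))  # det(I + u psi^T) = 1 + u.psi
    return z + u * np.tanh(a), log_det

rng = np.random.default_rng(0)
d = 4
z = rng.normal(size=d)                          # z0 ~ q(z0 | x), reparameterized
log_q = -0.5 * (d * np.log(2 * np.pi) + z @ z)  # log N(z0; 0, I), for illustration

for _ in range(3):                              # K = 3 flow steps
    # Placeholder parameters; a trained Flow-VAE learns these (and
    # constrains u so that u @ w >= -1, which keeps each step invertible).
    u, w, b = rng.normal(size=d), rng.normal(size=d), rng.normal()
    z, log_det = planar_step(z, u, w, b)
    log_q -= log_det                            # density of the transformed sample

print(z, log_q)                                 # z_K and log q(z_K | x) for the ELBO
```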
Typical Transformation Functions
- Affine Coupling Layer: Applies an elementwise affine transformation to one part of the input vector, with scale and shift computed from the other part (see the sketch after this list).
- Planar Flow: Adds a parameterized perturbation along a single direction; with suitable parameter constraints the transformation is invertible and its Jacobian determinant is simple to compute.
- Nonlinear Independent Components Estimation (NICE): Utilizes additive coupling layers and rescaling layers to achieve invertible transformations with simple analytical forms.
- Real-valued Non-Volume Preserving (RealNVP): Extends NICE by incorporating multiplicative scaling in the coupling layers (affine rather than additive coupling), so the transformations are no longer volume preserving.
- Masked Autoregressive Flow (MAF): Uses an autoregressive model for the mapping from data to noise, making density evaluation fast but sampling sequential.
- Inverse Autoregressive Flow (IAF): Inverts the autoregressive direction to parallelize sampling, while maintaining efficient likelihood computation for the points it generates.
- ActNorm: Applies a learned per-dimension shift and scale for normalization (introduced in Glow).
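To illustrate the most widely used building block above, here is a minimal NumPy sketch of a RealNVP-style affine coupling layer. The names `s_net` and `t_net` are hypothetical stand-ins for the small neural networks a real implementation would learn; any functions of the untouched half work, since the Jacobian stays triangular.

```python
import numpy as np

def affine_coupling(x, s_net, t_net):
    """Split x in half; transform one half using scale/shift from the other."""
    x1, x2 = np.split(x, 2)
    s, t = s_net(x1), t_net(x1)
    y2 = x2 * np.exp(s) + t      # elementwise affine transform of the second half
    log_det = np.sum(s)          # triangular Jacobian: log|det J| = sum of log-scales
    return np.concatenate([x1, y2]), log_det

def affine_coupling_inverse(y, s_net, t_net):
    y1, y2 = np.split(y, 2)
    s, t = s_net(y1), t_net(y1)
    x2 = (y2 - t) * np.exp(-s)   # exact inverse, no matrix inversion needed
    return np.concatenate([y1, x2])

# Toy "networks": fixed nonlinear functions of the untouched half.
s_net = lambda h: 0.5 * np.tanh(h)
t_net = lambda h: h ** 2

x = np.random.default_rng(1).normal(size=6)
y, log_det = affine_coupling(x, s_net, t_net)
assert np.allclose(x, affine_coupling_inverse(y, s_net, t_net))
```

The design point this illustrates: because `x1` passes through unchanged, the inverse and the Jacobian determinant are cheap no matter how complicated `s_net` and `t_net` are, which is what makes coupling layers so popular in practice.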
Conclusion
Normalizing flows provide a powerful method to represent, model, and sample complex high-dimensional distributions without directly reducing dimensionality. They transform the probability density of a simple distribution into that of a complex distribution through a series of invertible transformations, enabling exact density evaluation and sample generation for complex distributions. This approach is widely used in generative models, probability density estimation, and other machine learning tasks.