Statistical Notation for Conditioning (|) and Parameterization (;)
This topic has 0 replies, 1 voice, and was last updated 3 months ago by Wolf.
Understanding Probability Notation for AI Learners
In the context of Variational Autoencoders (VAEs) and Bayesian Deep Learning, we often see expressions like:
$$q_{\phi}(\mathbf{z} | \mathbf{x}) = \mathcal{N}(\mathbf{z}; \boldsymbol{\mu}, \boldsymbol{\Sigma})$$
This equation describes how an encoder maps an input $\mathbf{x}$ to a distribution $q_{\phi}$ over latent codes $\mathbf{z}$. Here is the breakdown of the specific symbols used to separate variables from each other and from parameters.
1. The Conditioning Bar ($|$)
The vertical bar $|$ denotes conditional probability. It represents a relationship between two random variables.
- Meaning: $P(A | B)$ means “The probability of $A$ occurring, given that $B$ has already occurred or is known.”
- In our example: $q(\mathbf{z} | \mathbf{x})$ is the probability density of a latent code $\mathbf{z}$ for one specific input image or data point $\mathbf{x}$.
- AI Context: This represents the Encoder. The latent vector $\mathbf{z}$ is “conditioned” on the input data $\mathbf{x}$.
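To make the bar concrete, here is a minimal sketch that computes $P(A | B)$ from a joint distribution using $P(A | B) = P(A, B) / P(B)$. The joint table and the helper name `conditional` are made up for illustration and are not part of any library.

```python
# Hypothetical joint distribution P(A, B) over two binary events.
# Keys are (a, b); the numbers are invented for illustration and sum to 1.
joint = {
    (True, True): 0.30,
    (True, False): 0.10,
    (False, True): 0.20,
    (False, False): 0.40,
}


def conditional(joint, a, b):
    """Return P(A=a | B=b) = P(A=a, B=b) / P(B=b)."""
    # Marginal P(B=b): sum the joint over every value of A.
    p_b = sum(p for (_, b_val), p in joint.items() if b_val == b)
    return joint[(a, b)] / p_b


# Unconditional P(A=True): sum the joint over every value of B.
p_a = sum(p for (a_val, _), p in joint.items() if a_val)

# P(A=True | B=True): knowing B changes what we believe about A.
p_a_given_b = conditional(joint, a=True, b=True)

print(p_a, p_a_given_b)
```

In this toy table, $P(A)$ is 0.4 while $P(A | B)$ is 0.6 (up to floating-point rounding), which is the sense in which both sides of the bar are random variables that carry information about each other.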
2. The Parameter Semicolon ($;$)
The semicolon $;$ is used to separate the random variables from the fixed parameters that define the shape of the distribution.
- Meaning: Everything to the left of the semicolon is a variable we are evaluating; everything to the right is a constant (parameter) that tells the function how to behave.
- In our example: $\mathcal{N}(\mathbf{z}; \boldsymbol{\mu}, \boldsymbol{\Sigma})$ says “Evaluate the Gaussian density at point $\mathbf{z}$, using a specific mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$.”
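- AI Context: Here the parameters are the “settings” of the distribution, not the network's weights $\phi$. For a given input $\mathbf{x}$, $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are the outputs produced by your neural network layers, and they stay fixed while $\mathbf{z}$ changes from sample to sample.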
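The semicolon reads naturally as a function signature: $\mathbf{z}$ is the argument, and $\boldsymbol{\mu}, \boldsymbol{\Sigma}$ are settings held fixed. For a $k$-dimensional Gaussian the density is $\mathcal{N}(\mathbf{z}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = (2\pi)^{-k/2} \det(\boldsymbol{\Sigma})^{-1/2} \exp\left(-\tfrac{1}{2}(\mathbf{z} - \boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{z} - \boldsymbol{\mu})\right)$. The sketch below evaluates it for $k = 2$ using only the standard library; the function name `gaussian_density_2d` and the example values are assumptions chosen for illustration.

```python
import math


def gaussian_density_2d(z, mu, sigma):
    """Evaluate N(z; mu, sigma) at a 2D point z.

    z, mu: sequences of length 2.
    sigma: 2x2 covariance matrix as nested lists (assumed symmetric
    and positive definite; this sketch does not check that).
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [z[0] - mu[0], z[1] - mu[1]]
    # Quadratic form (z - mu)^T Sigma^{-1} (z - mu).
    quad = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    # Normalizer for k = 2: (2*pi)^(-1) * det(Sigma)^(-1/2).
    norm = 1.0 / (2.0 * math.pi * math.sqrt(det))
    return norm * math.exp(-0.5 * quad)


# Parameters to the right of the semicolon: fixed for every evaluation.
mu = [0.0, 0.0]
sigma = [[1.0, 0.0], [0.0, 1.0]]

# The variable to the left of the semicolon: we are free to move it around.
density_at_mean = gaussian_density_2d([0.0, 0.0], mu, sigma)
density_far = gaussian_density_2d([2.0, 2.0], mu, sigma)

print(density_at_mean, density_far)
```

Only the first argument varies between the two calls, and the density is highest at the mean and falls off as $\mathbf{z}$ moves away from it.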
Comparison Summary
| Symbol | Name | Relationship Type | Example Role |
| --- | --- | --- | --- |
| $\vert$ | Vertical Bar | Variable to Variable | Links input $\mathbf{x}$ to latent $\mathbf{z}$. |
| $;$ | Semicolon | Variable to Parameter | Links $\mathbf{z}$ to its distribution shape ($\boldsymbol{\mu}, \boldsymbol{\Sigma}$). |
| $_{\phi}$ | Subscript | Function to Parameter | Links the model $q$ to its learnable weights $\phi$. |
Mathematical Structure
In deep learning, $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are typically vectors/matrices generated by a network. For a 2D latent space, they might look like this:
$$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}$$
By using the semicolon, we clarify that we aren’t “conditioning” on $\boldsymbol{\mu}$ in a probabilistic sense (treating it as a random variable), but rather using it as a fixed parameter of the density being evaluated.
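The three symbols can be seen working together in a toy encoder. This is a minimal sketch under stated assumptions, not how any particular VAE library does it: the linear "network", the weight values in `phi`, the helper names `matvec`, `encode` and `sample_z`, and the choice of a diagonal $\boldsymbol{\Sigma}$ (so the off-diagonal entries are zero) are all hypothetical simplifications.

```python
import math
import random

# Hypothetical learnable weights phi for a toy linear encoder
# (3 input features -> 2 latent dimensions). Values are invented.
phi = {
    "w_mu": [[0.5, -0.2, 0.1], [0.0, 0.3, -0.4]],
    "w_logvar": [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]],
}


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def encode(x, phi):
    """Map input x to the parameters of q_phi(z | x).

    Returns (mu, var), where var holds the diagonal of Sigma.
    Assumption: Sigma is diagonal, a common simplification.
    """
    mu = matvec(phi["w_mu"], x)
    log_var = matvec(phi["w_logvar"], x)
    var = [math.exp(v) for v in log_var]  # exp keeps every variance positive
    return mu, var


def sample_z(mu, var, rng):
    """Draw one z from N(z; mu, diag(var))."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]


x = [1.0, 2.0, 3.0]           # right of the bar: the input we condition on
mu, var = encode(x, phi)      # right of the semicolon: fixed once x is given
rng = random.Random(0)
z1 = sample_z(mu, var, rng)   # left of both: changes on every draw
z2 = sample_z(mu, var, rng)

print(mu, var)
print(z1, z2)
```

Calling `encode` twice on the same `x` returns the same `mu` and `var`, while each call to `sample_z` gives a different `z`. Changing `x` changes the distribution parameters, and changing `phi` changes the mapping itself.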
