Statistical Notation for Conditioning (|) and Parameterization (;)

    • February 16, 2026 at 1:31 pm #5461

      Understanding Probability Notation for AI Learners

      In the context of Variational Autoencoders (VAEs) and Bayesian Deep Learning, we often see expressions like:

      $$q_{\phi}(\mathbf{z} | \mathbf{x}) = \mathcal{N}(\mathbf{z}; \boldsymbol{\mu}, \boldsymbol{\Sigma})$$

      This equation describes how an encoder maps an input $\mathbf{x}$ to a distribution $q_{\phi}$ over latent codes $\mathbf{z}$. Here is the breakdown of the specific symbols used to separate its arguments.
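
      The right-hand side of the equation does not show where $\mathbf{x}$ enters. One common way to make that explicit is to write the mean and covariance as functions of the input. The subscripted names $\boldsymbol{\mu}_{\phi}$ and $\boldsymbol{\Sigma}_{\phi}$ are notation assumed here for illustration, not something the equation above defines:

      $$q_{\phi}(\mathbf{z} | \mathbf{x}) = \mathcal{N}\big(\mathbf{z}; \boldsymbol{\mu}_{\phi}(\mathbf{x}), \boldsymbol{\Sigma}_{\phi}(\mathbf{x})\big)$$

      Read this way, the bar on the left and the semicolon on the right describe the same dependence: once $\mathbf{x}$ is given, the network with weights $\phi$ fixes the parameters of the Gaussian.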


      1. The Conditioning Bar ($|$)

      The vertical bar $|$ denotes conditioning. It expresses how the distribution of one random variable depends on the value of another.

      • Meaning: $P(A | B)$ means “The probability of $A$ occurring, given that $B$ has already occurred or is known.”
      • In our example: $q(\mathbf{z} | \mathbf{x})$ is the probability density of a latent code $\mathbf{z}$ given a particular input image or data point $\mathbf{x}$.
      • AI Context: This represents the Encoder. The latent vector $\mathbf{z}$ is “conditioned” on the input data $\mathbf{x}$.
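
      For discrete events, the bar can be computed directly from a joint distribution as $P(A | B) = P(A, B) / P(B)$. The snippet below is a minimal sketch of that rule; the joint table and the helper name `conditional` are made up for illustration.

```python
# Hypothetical joint distribution P(A, B) over two binary events (illustrative numbers).
joint = {
    (True, True): 0.10,
    (True, False): 0.20,
    (False, True): 0.30,
    (False, False): 0.40,
}


def conditional(joint, a, b):
    """Return P(A = a | B = b) = P(A = a, B = b) / P(B = b)."""
    # Marginal P(B = b): sum the joint over every value of A.
    p_b = sum(p for (_, b_val), p in joint.items() if b_val == b)
    if p_b == 0:
        raise ValueError("P(B = b) is zero, so the conditional is undefined")
    return joint[(a, b)] / p_b


# Conditioning on B changes the probability of A compared with the marginal P(A).
p_a_given_b = conditional(joint, True, True)
p_a = sum(p for (a_val, _), p in joint.items() if a_val)
```

      Here both sides of the bar are random variables: `b` is a value that $B$ happened to take, not a setting of the distribution.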

      2. The Parameter Semicolon ($;$)

      The semicolon $;$ is used to separate the random variables from the fixed parameters that define the shape of the distribution.

      • Meaning: Everything to the left of the semicolon is the variable at which the function is evaluated; everything to the right is a parameter that is held fixed and tells the function how to behave.
      • In our example: $\mathcal{N}(\mathbf{z}; \boldsymbol{\mu}, \boldsymbol{\Sigma})$ says “Evaluate the Gaussian density at point $\mathbf{z}$, using a specific mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$.”
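      • AI Context: Parameters are the “settings” of the distribution. While $\mathbf{z}$ changes from sample to sample, $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are the outputs your neural network layers produce for a given $\mathbf{x}$, and they stay fixed while the density is evaluated. The network's own learnable weights are $\phi$, not $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$.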
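
      The same split shows up naturally in a function signature. The sketch below evaluates a one-dimensional Gaussian density $\mathcal{N}(z; \mu, \sigma^2)$; the name `normal_pdf` and the keyword-only parameters are choices made for this example, mirroring the semicolon.

```python
import math


def normal_pdf(z, *, mu, sigma):
    """Evaluate N(z; mu, sigma^2): z is the argument, mu and sigma are fixed parameters."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


# Same parameters, different evaluation points z.
densities = [normal_pdf(z, mu=0.0, sigma=1.0) for z in (-1.0, 0.0, 1.0)]

# Same evaluation point, different parameters: a different function of z.
shifted = normal_pdf(0.0, mu=2.0, sigma=1.0)
```

      The `*` in the signature plays the role of the semicolon: what comes before it is the point being evaluated, what comes after it configures the distribution.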

      Comparison Summary

      Symbol       Name           Relationship Type       Example Role
      $|$          Vertical Bar   Variable to Variable    Links input $\mathbf{x}$ to latent $\mathbf{z}$.
      $;$          Semicolon      Variable to Parameter   Links $\mathbf{z}$ to its distribution shape ($\boldsymbol{\mu}, \boldsymbol{\Sigma}$).
      $_{\phi}$    Subscript      Function to Parameter   Links the model $q$ to its learnable weights $\phi$.

      Mathematical Structure

      In deep learning, $\boldsymbol{\mu}$ is typically a vector and $\boldsymbol{\Sigma}$ a matrix, both generated by a network. For a 2D latent space, they might look like this:

      $$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \cr \mu_2 \end{pmatrix}, \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & \sigma_{12} \cr \sigma_{21} & \sigma_{22} \end{pmatrix}$$

      By using the semicolon, we clarify that we aren’t “conditioning” on $\boldsymbol{\mu}$ in a probabilistic sense (treating it as a random variable), but rather using it as a fixed coefficient for the calculation.
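
      To tie the three symbols together, here is a minimal sketch of a toy encoder for a 2D latent space. Everything in it is an assumption made for illustration: the weights in `phi` are arbitrary numbers, the encoder is a single linear layer, and $\boldsymbol{\Sigma}$ is taken to be diagonal (a common simplification in VAEs) so that the network only has to output two variances.

```python
import math
import random

# Hypothetical learnable weights phi for a toy linear encoder (3 inputs, 2 latents).
phi = {
    "W_mu": [[0.5, -0.2, 0.1], [0.0, 0.3, -0.4]],
    "b_mu": [0.0, 0.1],
    "W_logvar": [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]],
    "b_logvar": [-1.0, -1.0],
}


def affine(W, b, x):
    """Return W @ x + b using plain lists."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def encode(x, phi):
    """Map an input x to the parameters (mu, variances) of q_phi(z | x)."""
    mu = affine(phi["W_mu"], phi["b_mu"], x)
    log_var = affine(phi["W_logvar"], phi["b_logvar"], x)
    var = [math.exp(v) for v in log_var]  # diagonal of Sigma, always positive
    return mu, var


def sample_z(mu, var, rng):
    """Draw z ~ N(z; mu, diag(var)) as mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]


def log_density(z, mu, var):
    """Evaluate log N(z; mu, diag(var)) for a diagonal covariance."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (z_i - m) ** 2 / v)
               for z_i, m, v in zip(z, mu, var))


x = [1.0, 2.0, 3.0]                      # the input we condition on (the "|" side)
mu, var = encode(x, phi)                 # parameters after the ";" for this x
z = sample_z(mu, var, random.Random(0))  # one latent code drawn from q_phi(z | x)
log_q = log_density(z, mu, var)          # density evaluated at that z
```

      Changing `x` changes `mu` and `var`, which is the conditioning. Within a single call to `log_density`, `mu` and `var` do not vary, which is the semicolon. Training would change `phi`, which is the subscript.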

      • This topic was modified 3 months ago by Wolf.