
Lin’s Concordance Correlation Coefficient (CCC) in AI/ML

Anatomy of Cb: Why Two Terms for Scale, One Squared Term for Location, and a Numerator of Two

1. Starting Point: The CCC Denominator

To understand the bias correction factor $C_b$, we must first inspect the denominator of Lin’s Concordance Correlation Coefficient (CCC):

$$\rho_c = \frac{2\sigma_{xy}}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2}$$

The denominator contains exactly three additive components: $\sigma_x^2$, $\sigma_y^2$, and $(\mu_x - \mu_y)^2$. When Lin decomposed CCC into
$\rho_c = r \times C_b$, separating the Pearson correlation $r$ (precision) from $C_b$ (accuracy), the internal structure of these three terms determined
the final form of $C_b$.
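The formula above can be evaluated directly from data. The following is a minimal sketch, assuming population (divide-by-$n$) moments and plain Python lists; the function name `ccc` and the sample values are illustrative and not taken from any library.

```python
from statistics import fmean

def ccc(x, y):
    """Lin's CCC from population (divide-by-n) moments. Illustrative helper."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Three additive denominator terms: two variances and the squared mean gap.
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)

y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_shifted = [t + 1.0 for t in y_true]  # same shape, constant offset of 1

print(ccc(y_true, y_true))     # 1.0
print(ccc(y_true, y_shifted))  # 0.8
```

The shifted series has a Pearson correlation of 1 with the original, yet the squared mean difference in the denominator pulls CCC down to 0.8.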

2. Deriving Cb by Normalization

Divide both the numerator and the denominator of CCC by $2\sigma_x\sigma_y$:

$$\rho_c = \frac{\sigma_{xy} / (\sigma_x\sigma_y)}{[\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2] \;/\; (2\sigma_x\sigma_y)}$$

The numerator becomes $r$.
The denominator becomes:

$$\frac{\sigma_x^2}{2\sigma_x\sigma_y} + \frac{\sigma_y^2}{2\sigma_x\sigma_y} + \frac{(\mu_x - \mu_y)^2}{2\sigma_x\sigma_y} = \frac{v + 1/v + u^2}{2}$$

where $v = \sigma_x / \sigma_y$ and $u = (\mu_x - \mu_y) / \sqrt{\sigma_x \sigma_y}$, with both standard deviations assumed strictly positive.

Therefore:

$$\rho_c = \frac{r}{\;(v + 1/v + u^2)\;/\;2\;} = r \times \frac{2}{v + 1/v + u^2} = r \times C_b$$

$$C_b = \frac{2}{v + \frac{1}{v} + u^2}$$
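As a numerical check on this derivation, the sketch below computes $\rho_c$ two ways: directly from the original formula, and as $r \times C_b$ using $v$ and $u$. It assumes population moments and strictly positive standard deviations; the helper names and the sample data are hypothetical.

```python
import math
from statistics import fmean

def moments(x, y):
    """Population means, standard deviations, and covariance."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sx, sy, sxy

def ccc_direct(x, y):
    mx, my, sx, sy, sxy = moments(x, y)
    return 2 * sxy / (sx**2 + sy**2 + (mx - my) ** 2)

def ccc_factored(x, y):
    """Return (r, C_b); their product should equal the direct CCC."""
    mx, my, sx, sy, sxy = moments(x, y)
    r = sxy / (sx * sy)                 # precision
    v = sx / sy                         # scale shift
    u = (mx - my) / math.sqrt(sx * sy)  # location shift
    c_b = 2 / (v + 1 / v + u**2)        # accuracy
    return r, c_b

x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.5, 3.0, 6.0, 6.5, 11.0]
r, c_b = ccc_factored(x, y)
print(r, c_b, r * c_b, ccc_direct(x, y))
```

The last two printed values should agree up to floating-point rounding, which is what the factorization $\rho_c = r \times C_b$ predicts.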

3. Why Scale Shift Produces Two Terms (v + 1/v)

The CCC denominator contains two separate variance terms, $\sigma_x^2$ and $\sigma_y^2$, because each variable contributes its own spread independently. After normalization by $\sigma_x\sigma_y$, these become:

$$\frac{\sigma_x^2}{\sigma_x\sigma_y} = \frac{\sigma_x}{\sigma_y} = v, \qquad \frac{\sigma_y^2}{\sigma_x\sigma_y} = \frac{\sigma_y}{\sigma_x} = \frac{1}{v}$$

This two-term structure carries three important properties:

Symmetry. Swapping $X$ and $Y$ sends $v \to 1/v$, but the sum $v + 1/v$ remains unchanged. The metric is indifferent to which variable is labeled $X$ or $Y$.

Bounded minimum via AM-GM. By the arithmetic-geometric mean inequality:

$$v + \frac{1}{v} \geq 2\sqrt{v \cdot \frac{1}{v}} = 2$$

Equality holds if and only if $v = 1$, i.e., the two standard deviations are identical. Any departure from equal scales increases $v + 1/v$ beyond 2, thereby reducing $C_b$.

Symmetric penalization. Whether $\sigma_x$ is twice $\sigma_y$ or $\sigma_y$ is twice $\sigma_x$, the penalty is the same ($v + 1/v = 2.5$ in both cases). The two terms ensure that scale discrepancy is penalized regardless of direction.
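The three properties can be checked with a few values of $v$. This is a small sketch; `scale_term` is a hypothetical helper name, and it assumes $v > 0$.

```python
def scale_term(v):
    """Scale contribution v + 1/v to the C_b denominator (requires v > 0)."""
    if v <= 0:
        raise ValueError("v must be positive")
    return v + 1 / v

for v in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(v, scale_term(v))
# 0.25 4.25
# 0.5 2.5
# 1.0 2.0
# 2.0 2.5
# 4.0 4.25
```

Reciprocal pairs such as 0.5 and 2.0 give the same value, and the minimum of 2 occurs at $v = 1$.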

4. Why Location Shift Appears as a Squared Term (u²)

The location shift enters the CCC denominator as $(\mu_x - \mu_y)^2$, a single squared term. After normalization by $\sigma_x\sigma_y$, it becomes $u^2$. Three reasons explain why the square is necessary:

Algebraic origin. CCC is fundamentally related to the expected squared difference between $X$ and $Y$:

$$E[(X - Y)^2] = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} + (\mu_x - \mu_y)^2$$

In this expansion, the mean difference inevitably appears as a squared term. This is not a design choice but a mathematical consequence of working with
second-moment quantities.

Direction invariance. Whether $\mu_x > \mu_y$ or $\mu_x < \mu_y$, only the magnitude of the bias should affect agreement. Squaring eliminates the sign, ensuring that a positive bias of 5 and a negative bias of 5 produce the same $C_b$ value.

Dimensional consistency. The other terms in the denominator ($\sigma_x^2$ and $\sigma_y^2$) have units of variance (squared units). For all three terms to be additive, the mean difference must also be squared to match this dimension.
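The expansion of $E[(X - Y)^2]$ and the direction invariance can both be verified on sample data. The sketch below assumes population moments; the function names and the data are illustrative only.

```python
import math
from statistics import fmean

def mean_squared_difference(x, y):
    """Empirical E[(X - Y)^2]."""
    return fmean([(a - b) ** 2 for a, b in zip(x, y)])

def moment_expansion(x, y):
    """var_x + var_y - 2*cov_xy + (mean_x - mean_y)^2, population moments."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return var_x + var_y - 2 * cov_xy + (mx - my) ** 2

x = [2.0, 4.0, 5.0, 7.0, 9.0]
y_high = [a + 5.0 for a in x]  # bias of +5
y_low = [a - 5.0 for a in x]   # bias of -5

print(mean_squared_difference(x, y_high), moment_expansion(x, y_high))
print(mean_squared_difference(x, y_low), moment_expansion(x, y_low))
```

Both directions of bias give a mean squared difference of 25, and the moment expansion matches it up to rounding, since the variance and covariance terms cancel when the two series differ only by a constant.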

5. Why the Numerator Is Exactly 2

The numerator of $C_b$ is the constant 2. This value is not arbitrary; it is dictated by the requirement that $C_b = 1$ under perfect agreement and by the algebraic structure of CCC itself.

From the normalization perspective: The denominator $v + 1/v + u^2$ achieves its minimum value of 2 when $v = 1$ and $u = 0$ (perfect scale match and zero location bias). To ensure $C_b = 1$ at this minimum:

$$C_b^{\max} = \frac{\text{numerator}}{2} = 1 \quad \Longrightarrow \quad \text{numerator} = 2$$

Any other numerator would shift the maximum away from 1. For instance, a numerator of 1 would yield $C_b^{\max} = 0.5$, destroying the intuitive $(0, 1]$ interpretation in which $C_b = 1$ means no scale or location bias.

From the algebraic derivation: When factoring CCC as $r \times C_b$, the numerator $2\sigma_{xy}$ is written as $r \cdot 2\sigma_x\sigma_y$, where $r = \sigma_{xy}/(\sigma_x\sigma_y)$. Pulling out $r$ leaves $2\sigma_x\sigma_y$, and dividing that by $\sigma_x\sigma_y$ leaves exactly 2. The constant emerges naturally from the factorization rather
than being imposed externally.

Both perspectives converge: The algebraic decomposition produces a numerator of 2, and this is precisely the value needed to normalize $C_b$ into the range $(0, 1]$. This is not a coincidence. It reflects the internal consistency of CCC’s mathematical structure: the coefficient 2 in $2\sigma_{xy}$ is the same 2 that bounds $v + 1/v$ from below, because $\sigma_x^2 + \sigma_y^2 \geq 2\sigma_x\sigma_y$.
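A coarse grid search illustrates the normalization argument. This is only a sketch: the function `c_b`, its `numerator` parameter, and the grid bounds are assumptions made for illustration.

```python
def c_b(v, u, numerator=2.0):
    """Bias correction factor; `numerator` is exposed only for comparison."""
    return numerator / (v + 1 / v + u**2)

v_grid = [i / 20 for i in range(5, 81)]    # 0.25 .. 4.0, includes v = 1.0
u_grid = [i / 10 for i in range(-20, 21)]  # -2.0 .. 2.0, includes u = 0.0

# Each candidate is (C_b value, v, u); max() picks the largest C_b.
best = max((c_b(v, u), v, u) for v in v_grid for u in u_grid)
print(best)                           # (1.0, 1.0, 0.0)
print(c_b(1.0, 0.0, numerator=1.0))   # 0.5
```

On this grid the maximum is attained at $v = 1$, $u = 0$, and it equals 1 only when the numerator is 2.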

6. Summary Table

| Component | CCC Denominator Term | After Normalization | Range | Role in $C_b$ |
|---|---|---|---|---|
| Scale shift | $\sigma_x^2 + \sigma_y^2$ (2 terms) | $v + 1/v$ | $[2, \infty)$ | Penalizes unequal variability |
| Location shift | $(\mu_x - \mu_y)^2$ (1 term) | $u^2$ | $[0, \infty)$ | Penalizes shifted center |
| Denominator total | All three summed | $v + 1/v + u^2$ | $[2, \infty)$ | Combined disagreement measure |
| Numerator | From $2\sigma_{xy}$ | $2$ | Fixed | Normalizes $C_b$ to $(0, 1]$ |

Under perfect agreement ($v = 1$, $u = 0$), the denominator equals 2 and $C_b = 2/2 = 1$. As either scale or location disagreement grows, the denominator increases without bound and $C_b$ approaches zero. Every constant and every exponent in the formula traces back to a specific structural requirement: dimensional consistency, directional symmetry, bounded range, or algebraic necessity.
