Center Alignment Index (CAI): A Novel Metric for Evaluating Data Center Agreement on the 1:1 Line
The Lorentzian Shape of CAI: Sensitivity Analysis and the Tolerance Parameter k
Why CAI Is a Lorentzian Function
The functional form of CAI, $1/(1 + u^2)$, is identical to the normalized Lorentzian (Cauchy kernel) peaked at the origin. This is not an arbitrary design choice but an algebraic consequence of CCC’s structure. The CCC denominator $\sigma_x^2 + \sigma_y^2 + (\mu_x – \mu_y)^2$ is quadratic in the mean difference. When we isolate the location component by fixing the scale ratio $v = 1$, the bias correction factor becomes:
$$C_b\big|_{v=1} = \frac{2}{2 + u^2} = \frac{1}{1 + u^2/2}$$
CAI adopts a sharper variant $1/(1 + u^2)$, but the Lorentzian shape itself is inherited directly from the quadratic denominator of CCC. The reciprocal of any positive-definite quadratic in $u$ necessarily produces a Lorentzian profile. In other words, the Lorentzian form is not chosen for CAI; it is derived from CCC.
This functional form carries several practical advantages. First, the peak value is exactly 1 at $u = 0$ and the function decays monotonically toward 0 as $|u| \to \infty$, matching the semantic requirement of an agreement index. Second, compared to a Gaussian kernel $e^{-u^2}$, the Lorentzian exhibits heavier tails, maintaining discriminative power over a wider range of bias values:
| $u$ | Lorentzian $1/(1+u^2)$ | Gaussian $e^{-u^2}$ |
|---|---|---|
| 0.0 | 1.000 | 1.000 |
| 1.0 | 0.500 | 0.368 |
| 2.0 | 0.200 | 0.018 |
| 3.0 | 0.100 | 0.0001 |
At $u = 2$, the Gaussian has essentially collapsed to zero (0.018), whereas the Lorentzian still returns 0.200, providing meaningful gradation. Third, the closed-form inverse $u = \sqrt{1/c – 1}$ for a given CAI value $c$ simplifies error propagation and analytical interpretation.
Sensitivity Concern: Is CAI Too Aggressive?
Despite the heavier tails relative to the Gaussian, a closer inspection of CAI’s behavior reveals a potential problem. Since $u \approx \Delta\mu / \sigma$ when $\sigma_x \approx \sigma_y$, the location shift $u$ can be interpreted as the bias measured in units of the geometric standard deviation. The resulting CAI values drop steeply:
| Bias Magnitude | $u$ | CAI | Practical Judgment |
|---|---|---|---|
| $0.5\sigma$ | 0.5 | 0.800 | Acceptable in most domains |
| $1.0\sigma$ | 1.0 | 0.500 | Half-power — already harsh |
| $1.5\sigma$ | 1.5 | 0.308 | Flagged as poor |
| $2.0\sigma$ | 2.0 | 0.200 | Nearly rejected |
A mean bias of $1\sigma$ is not uncommon in remote sensing retrievals, climate model outputs, or cross-sensor comparisons. Yet CAI assigns it a score of 0.500, which reads as “50% disagreement.” This severity stems from the fact that CAI uses $u^2$ in the denominator, whereas the original $C_b$ uses $u^2/2$, effectively doubling the penalization rate.
Introducing the Tolerance Parameter k
To address this, we generalize CAI with a tunable tolerance parameter $k > 0$:
$$CAI_k = \frac{1}{1 + u^2 / k}$$
The parameter $k$ controls the half-power point: CAI$_k = 0.5$ occurs at $u = \sqrt{k}$. A larger $k$ makes the metric more tolerant of bias; a smaller $k$ makes it stricter. The following table illustrates the effect:
| $u$ | $k=1$ (strict) | $k=2$ (moderate) | $k=4$ (tolerant) |
|---|---|---|---|
| 0.5 | 0.800 | 0.889 | 0.941 |
| 1.0 | 0.500 | 0.667 | 0.800 |
| 2.0 | 0.200 | 0.333 | 0.500 |
| 3.0 | 0.100 | 0.182 | 0.308 |
Setting $k = 2$ exactly recovers the decay rate of Lin’s $C_b|_{v=1}$, providing a theoretically grounded default. Setting $k = 4$ shifts the half-power point to $2\sigma$, which may be appropriate for domains where moderate systematic bias is operationally acceptable.
The recommended notation is $CAI_k$ or $CAI(u;\,k)$, with the following formal definition:
$CAI_k = 1/(1 + u^2/k)$, where $k > 0$ is the tolerance parameter governing sensitivity to location bias. The half-power point occurs at $u = \sqrt{k}$.
Benchmarking: Tunable Parameters in AI/ML Evaluation Metrics
The introduction of a tolerance parameter into an evaluation metric is a well-established practice in AI/ML. Several widely used metrics follow the same design principle: a core formula whose behavior is modulated by one or more parameters that encode domain-specific cost structures.
$F_\beta$ Score is perhaps the most familiar example. The formula $F_\beta = (1+\beta^2) \cdot PR / (\beta^2 P + R)$ uses $\beta^2$ to weight recall relative to precision. At $\beta = 1$, precision and recall contribute equally ($F_1$). In medical screening, $\beta = 2$ is standard because missing a true positive (false negative) is far more costly than a false alarm. In spam filtering, $\beta = 0.5$ prioritizes precision because users strongly dislike legitimate emails being discarded.
Focal Loss addresses extreme class imbalance in object detection through $FL = -\alpha_t (1 – p_t)^\gamma \log(p_t)$. The focusing parameter $\gamma$ down-weights easy examples: at $\gamma = 0$ the formula reduces to standard cross-entropy, while at $\gamma = 2$ (the recommended default from the RetinaNet paper) a well-classified sample with $p_t = 0.9$ contributes only $(0.1)^2 = 1\%$ of its original loss. This single parameter transforms the loss landscape from one dominated by easy negatives to one focused on hard examples.
Tversky Index generalizes the Dice and Jaccard coefficients for image segmentation: $TI = TP / (TP + \alpha \cdot FP + \beta \cdot FN)$. Setting $\alpha = \beta = 0.5$ recovers Dice; $\alpha = \beta = 1$ recovers Jaccard. In medical image segmentation for small lesion detection, practitioners set $\beta > \alpha$ to penalize false negatives more heavily, reducing the risk of missing pathological regions.
Huber Loss smoothly interpolates between MSE and MAE via the threshold $\delta$: quadratic for $|r| \leq \delta$, linear beyond. As $\delta \to \infty$, Huber converges to MSE; as $\delta \to 0$, it converges to MAE. The parameter $\delta$ encodes how much residual magnitude the practitioner considers “normal” versus “outlier.”
Quantile Loss uses $\tau \in (0,1)$ to impose asymmetric penalties on over- and under-prediction. At $\tau = 0.5$ it equals MAE; at $\tau = 0.9$ it penalizes under-prediction nine times more than over-prediction. This is essential for constructing prediction intervals and risk-aware forecasts.
Common Design Pattern
These metrics share a structural template that CAI$_k$ follows precisely:
| Metric | Parameter | Controls | Recovers At |
|---|---|---|---|
| $CAI_k$ | $k$ | Bias tolerance | $C_b$ |
| $F_\beta$ | $\beta$ | Recall vs. Precision weight | $F_1$ at $\beta=1$ |
| Focal Loss | $\gamma$ | Easy-sample suppression | CE at $\gamma=0$ |
| Tversky Index | $\alpha,\beta$ | FP vs. FN cost | Dice at $\alpha=\beta=0.5$ |
| Huber Loss | $\delta$ | Outlier tolerance | MSE at $\delta\to\infty$ |
| Quantile Loss | $\tau$ | Over/under-prediction asymmetry | MAE at $\tau=0.5$ |
In every case, (1) a specific parameter value recovers a well-known existing metric, (2) the parameter encodes domain-specific cost or tolerance, and (3) the core mathematical structure remains unchanged across parameter values. CAI$_k$ inherits all three properties: $k=2$ recovers $C_b|_{v=1}$, $k$ reflects the domain’s acceptable bias level, and the Lorentzian form is preserved for all $k > 0$.
