{"id":6561,"date":"2026-05-03T12:48:49","date_gmt":"2026-05-03T17:48:49","guid":{"rendered":"https:\/\/ykim.synology.me\/wordpress\/?p=6561"},"modified":"2026-05-04T02:46:23","modified_gmt":"2026-05-04T07:46:23","slug":"are-missing-path-samples-in-tree-based-models-ood","status":"publish","type":"post","link":"https:\/\/ykim.synology.me\/wordpress\/are-missing-path-samples-in-tree-based-models-ood-6561\/","title":{"rendered":"Are Missing-Path Samples in Tree-Based Models OOD?"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"900\" height=\"600\" src=\"https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/05\/2025122x-Endless-Blue-Sky-Santa-Fe-Highway-900x600px.jpg\" alt=\"\" class=\"wp-image-6562\" style=\"width:600px\" srcset=\"https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/05\/2025122x-Endless-Blue-Sky-Santa-Fe-Highway-900x600px.jpg 900w, https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/05\/2025122x-Endless-Blue-Sky-Santa-Fe-Highway-900x600px-300x200.jpg 300w, https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/05\/2025122x-Endless-Blue-Sky-Santa-Fe-Highway-900x600px-768x512.jpg 768w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/figure>\n\n\n<style>.kadence-column6561_85b3ea-05 > .kt-inside-inner-col,.kadence-column6561_85b3ea-05 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_85b3ea-05 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_85b3ea-05 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_85b3ea-05 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_85b3ea-05 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_85b3ea-05{position:relative;}.kadence-column6561_85b3ea-05, .kt-inside-inner-col > 
.kadence-column6561_85b3ea-05:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_85b3ea-05 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_85b3ea-05 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_85b3ea-05\"><div class=\"kt-inside-inner-col\">\n<h2 class=\"wp-block-heading\">Bottom Line<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Strictly speaking, no \u2014 but in practice, treat them as Out-of-Distribution (OOD).<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Missing-path samples in tree-based boosting models such as LightGBM, CatBoost, and XGBoost do not match the academic definition of OOD perfectly, yet they carry essentially the same risk in deployed systems. This post explains why this is a borderline case, surveys five concrete detection techniques applicable to all three libraries, and shows how detection outputs feed directly into five mitigation strategies \u2014 with code throughout.<\/p>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">1. 
What Is a Missing-Path Sample?<\/h2>\n\n\n<style>.kadence-column6561_b4ddde-75 > .kt-inside-inner-col,.kadence-column6561_b4ddde-75 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_b4ddde-75 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_b4ddde-75 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_b4ddde-75 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_b4ddde-75 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_b4ddde-75{position:relative;}.kadence-column6561_b4ddde-75, .kt-inside-inner-col > .kadence-column6561_b4ddde-75:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_b4ddde-75 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_b4ddde-75 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_b4ddde-75\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">&#8220;Missing path&#8221; has two meanings in tree models:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Meaning 1 \u2014 A branch path never created during training.<\/strong> If the combination &#8220;age &gt; 60 AND income &lt; 20K&#8221; never appeared in the training data, the tree had no data from which to learn a branch isolating that combination. At inference, the existing splits still route such a sample into some leaf, whose value was fit on other combinations.<\/li>\n\n\n\n<li><strong>Meaning 2 \u2014 Missing-value (NaN) routing.<\/strong> XGBoost&#8217;s learned default direction, or LightGBM&#8217;s missing-value handling (the <code>use_missing<\/code> option), determines which side of a split a NaN goes to. 
This is an explicit signal, not an OOD problem.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This post focuses on <strong>Meaning 1<\/strong> \u2014 the side that overlaps ambiguously with OOD and poses the greater operational risk.<\/p>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">2. Why It Is Not Strictly OOD<\/h2>\n\n\n<style>.kadence-column6561_f89d90-37 > .kt-inside-inner-col,.kadence-column6561_f89d90-37 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_f89d90-37 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_f89d90-37 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_f89d90-37 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_f89d90-37 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_f89d90-37{position:relative;}.kadence-column6561_f89d90-37, .kt-inside-inner-col > .kadence-column6561_f89d90-37:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_f89d90-37 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_f89d90-37 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_f89d90-37\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">OOD is defined as input $x$ falling outside the support of $P_{\\text{train}}(x)$. 
Missing-path samples differ subtly from this.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2.1 Marginal vs Joint Support<\/h3>\n\n\n<style>.kadence-column6561_c907a4-e2 > .kt-inside-inner-col,.kadence-column6561_c907a4-e2 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_c907a4-e2 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_c907a4-e2 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_c907a4-e2 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_c907a4-e2 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_c907a4-e2{position:relative;}.kadence-column6561_c907a4-e2, .kt-inside-inner-col > .kadence-column6561_c907a4-e2:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_c907a4-e2 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_c907a4-e2 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_c907a4-e2\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">If <code>age=70<\/code> and <code>income=15K<\/code> both occurred in training individually, the marginal supports cover the sample. Only the <strong>combination<\/strong> (joint distribution) is unseen. 
This is sometimes called <strong>combinatorial OOD<\/strong> or an <strong>interpolation gap<\/strong>.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">2.2 A Structural Limit of Tree Partitioning<\/h3>\n\n\n<style>.kadence-column6561_417581-a6 > .kt-inside-inner-col,.kadence-column6561_417581-a6 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_417581-a6 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_417581-a6 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_417581-a6 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_417581-a6 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_417581-a6{position:relative;}.kadence-column6561_417581-a6, .kt-inside-inner-col > .kadence-column6561_417581-a6:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_417581-a6 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_417581-a6 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_417581-a6\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">Trees partition the input space into axis-aligned rectangles, and every point in the space belongs to exactly one of them. Because the rectangles are fit to where the training data lies, some of them extend over regions that contain no training samples. A sample falling in such a gap is still routed to a leaf and receives that leaf&#8217;s value, which is closer to an extrapolation\/interpolation limit baked into the model architecture than to OOD per se.<\/p>\n<\/div><\/div>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">3. 
Why You Should Treat Them as OOD Anyway<\/h2>\n\n\n<style>.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col,.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_b1c8d9-a4{position:relative;}.kadence-column6561_b1c8d9-a4, .kt-inside-inner-col > .kadence-column6561_b1c8d9-a4:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_b1c8d9-a4 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_b1c8d9-a4\"><div class=\"kt-inside-inner-col\">\n<ul class=\"wp-block-list\">\n<li><strong>No reliability guarantee.<\/strong> Trees offer no statistical guarantee on regions unseen during training; the nearest-leaf prediction may be arbitrary.<\/li>\n\n\n\n<li><strong>Trees extrapolate poorly.<\/strong> Neural networks at least extrapolate smoothly (whether correctly or not). 
Trees flatline at the leaf value beyond the trained range \u2014 disastrous in regression.<\/li>\n\n\n\n<li><strong>Joint-distribution OOD really is OOD.<\/strong> Under a strict definition based on joint $P(x_1, x_2, \\ldots, x_n)$, sparse joint regions qualify as OOD even if marginals are in-distribution.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3.1 Where It Sits Academically<\/h3>\n\n\n<style>.kadence-column6561_85c42b-84 > .kt-inside-inner-col,.kadence-column6561_85c42b-84 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_85c42b-84 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_85c42b-84 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_85c42b-84 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_85c42b-84 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_85c42b-84{position:relative;}.kadence-column6561_85c42b-84, .kt-inside-inner-col > .kadence-column6561_85c42b-84:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_85c42b-84 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_85c42b-84 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_85c42b-84\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">The OOD literature (Yang 2021) discusses this issue under three overlapping categories: <strong>combinatorial \/ compositional OOD<\/strong> (elements seen but combinations new), <strong>epistemic uncertainty<\/strong> (uncertainty from insufficient training in that region), and <strong>coverage gap \/ sparse region<\/strong> (a generic term for low-density regions of $P_{\\text{train}}(x)$, the typical target of 
density-based OOD detectors).<\/p>\n<\/div><\/div>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">4. Detection Techniques for Missing-Path Samples<\/h2>\n\n\n<style>.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col,.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_8dc13a-a8{position:relative;}.kadence-column6561_8dc13a-a8, .kt-inside-inner-col > .kadence-column6561_8dc13a-a8:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_8dc13a-a8 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_8dc13a-a8\"><div class=\"kt-inside-inner-col\">\n<h3 class=\"wp-block-heading\">4.0 Bridging Detection and Mitigation<\/h3>\n\n\n<style>.kadence-column6561_20a82e-4a > .kt-inside-inner-col,.kadence-column6561_20a82e-4a > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_20a82e-4a > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_20a82e-4a > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_20a82e-4a > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_20a82e-4a > 
.kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_20a82e-4a{position:relative;}.kadence-column6561_20a82e-4a, .kt-inside-inner-col > .kadence-column6561_20a82e-4a:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_20a82e-4a > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_20a82e-4a > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_20a82e-4a\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">Before jumping to mitigations, the system must first answer: <strong>&#8220;is this sample a missing-path risk?&#8221;<\/strong> Applying every safeguard to every sample wastes accuracy and throughput.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The detection stage outputs a <strong>confidence score or OOD score<\/strong> per sample: a continuous number quantifying how well the training data covers the region the sample falls in. Higher means safer (in-distribution); lower means riskier (missing-path candidate).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This score then feeds directly into the mitigations of Section 5. High score \u2192 trust the model and serve the prediction. Low score \u2192 reject, fall back to a more conservative model, return a wider Prediction Interval (PI), or escalate to a human.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Detection outputs are therefore not a boolean but a <strong>continuous score plus a threshold<\/strong>. Thresholds are calibrated on training or holdout data (e.g., the 5th percentile of the score distribution). 
Each technique below provides a <strong>numeric good \/ borderline \/ bad rule<\/strong>; concrete cutoffs are dataset-dependent and should be calibrated empirically.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4.1 Leaf-Based Confidence<\/h3>\n\n\n<style>.kadence-column6561_5b828f-30 > .kt-inside-inner-col,.kadence-column6561_5b828f-30 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_5b828f-30 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_5b828f-30 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_5b828f-30 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_5b828f-30 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_5b828f-30{position:relative;}.kadence-column6561_5b828f-30, .kt-inside-inner-col > .kadence-column6561_5b828f-30:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_5b828f-30 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_5b828f-30 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_5b828f-30\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\"><strong>Principle.<\/strong> Check how many training samples ended up in the leaf the query sample lands in. Sparsely populated leaves indicate undertrained regions. 
The technique requires no extra model \u2014 the tree structure itself supplies the confidence signal.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Algorithm: (1) push training data through the model and record which leaf each sample reaches in each tree, then count training samples per leaf; (2) at inference, average (or take the minimum of) those counts over all trees for the query sample. All three libraries expose leaf indices: XGBoost via <code>predict(..., pred_leaf=True)<\/code>, LightGBM via the same option, and CatBoost via <code>calc_leaf_indexes()<\/code>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Numeric thresholds (good \/ borderline \/ bad).<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Good (in-distribution):<\/strong> score \u2265 25th percentile of training scores. Typically &gt; 20 training samples per leaf on average.<\/li>\n\n\n\n<li><strong>Borderline:<\/strong> between the 5th and 25th percentile. About 5 \u2013 20 samples per leaf.<\/li>\n\n\n\n<li><strong>Bad (missing-path risk):<\/strong> &lt; 5th percentile. 
Fewer than 5 samples per leaf, or a leaf with only 1 \u2013 2 training samples in some trees.<\/li>\n<\/ul>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:1rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#24292e;--cbp-line-number-width:calc(2 * 0.6 * 1rem);line-height:1.625rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#24292e;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>import numpy as np\nimport xgboost as xgb\nimport lightgbm as lgb\nfrom catboost import CatBoostRegressor\nfrom sklearn.datasets import make_regression\nfrom sklearn.model_selection import train_test_split\nfrom collections import Counter\n\n\ndef leaf_coverage_scores(model, X_train, X_query, library=\"xgboost\"):\n    \"\"\"\n    Compute leaf-coverage-based confidence scores.\n    Returns the mean, minimum, and median across trees.\n    \"\"\"\n    if library == \"xgboost\":\n        import xgboost as xgb\n        train_leaves = model.predict(xgb.DMatrix(X_train), pred_leaf=True)\n        query_leaves = model.predict(xgb.DMatrix(X_query), pred_leaf=True)\n    elif library == \"lightgbm\":\n        train_leaves = model.predict(X_train, pred_leaf=True)\n        query_leaves = model.predict(X_query, pred_leaf=True)\n    elif library == \"catboost\":\n        train_leaves = model.calc_leaf_indexes(X_train)\n        query_leaves = model.calc_leaf_indexes(X_query)\n\n    n_trees = train_leaves.shape&#091;1&#093;\n    n_query = len(X_query)\n\n    # Number of training samples reaching each leaf, per tree\n    leaf_counts = [Counter(train_leaves&#091;:, t&#093;) for t in 
range(n_trees)]\n\n    # Matrix of per-tree leaf training counts for each query sample\n    counts_matrix = np.zeros((n_query, n_trees))\n    for t in range(n_trees):\n        for i in range(n_query):\n            counts_matrix&#091;i, t&#093; = leaf_counts&#091;t&#093;.get(query_leaves&#091;i, t&#093;, 0)\n\n    return {\n        \"mean\": counts_matrix.mean(axis=1),    # Mean: standard confidence signal\n        \"min\": counts_matrix.min(axis=1),      # Min: most conservative signal\n        \"median\": np.median(counts_matrix, axis=1),  # Median: robust to outliers\n    }\n\ndef classify_by_leaf_coverage(score, train_scores):\n    p5 = np.quantile(train_scores, 0.05)\n    p25 = np.quantile(train_scores, 0.25)\n    if score >= p25:\n        return \"good\"\n    elif score >= p5:\n        return \"borderline\"\n    else:\n        return \"bad\"\n\n\n# Example\nX, y = make_regression(n_samples=1000, n_features=5, noise=0.1, random_state=42)\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n\ndtrain = xgb.DMatrix(X_train, label=y_train)\nxgb_model = xgb.train(\n    {\"objective\": \"reg:squarederror\", \"max_depth\": 4},\n    dtrain,\n    num_boost_round=50,\n)\n\n# leaf_coverage_scores returns a dict; use the mean-across-trees array as the score\ntrain_scores = leaf_coverage_scores(xgb_model, X_train, X_train)&#091;\"mean\"&#093;\nood_sample = X_test&#091;:5&#093; + 5.0\nquery_scores = leaf_coverage_scores(xgb_model, X_train, ood_sample)&#091;\"mean\"&#093;\n\nfor s in query_scores:\n    label = classify_by_leaf_coverage(s, train_scores)\n    print(f\"score={s:.2f} -> {label}\")<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" 
stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-light\" style=\"background-color: #fff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> numpy <\/span><span style=\"color: #D73A49\">as<\/span><span style=\"color: #24292E\"> np<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> xgboost <\/span><span style=\"color: #D73A49\">as<\/span><span style=\"color: #24292E\"> xgb<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> lightgbm <\/span><span style=\"color: #D73A49\">as<\/span><span style=\"color: #24292E\"> lgb<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">from<\/span><span style=\"color: #24292E\"> catboost <\/span><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> CatBoostRegressor<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">from<\/span><span style=\"color: #24292E\"> sklearn.datasets <\/span><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> make_regression<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">from<\/span><span style=\"color: #24292E\"> sklearn.model_selection <\/span><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> train_test_split<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">from<\/span><span style=\"color: #24292E\"> collections <\/span><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> Counter<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">def<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: 
leaf_coverage_scores">
#6F42C1\">leaf_coverage_scores<\/span><span style=\"color: #24292E\">(model, X_train, X_query, library<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #032F62\">&quot;xgboost&quot;<\/span><span style=\"color: #24292E\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #032F62\">&quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #032F62\">    Compute leaf-coverage-based confidence scores.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #032F62\">    Returns the mean, minimum, and median across trees.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #032F62\">    &quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">if<\/span><span style=\"color: #24292E\"> library <\/span><span style=\"color: #D73A49\">==<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #032F62\">&quot;xgboost&quot;<\/span><span style=\"color: #24292E\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #D73A49\">import<\/span><span style=\"color: #24292E\"> xgboost <\/span><span style=\"color: #D73A49\">as<\/span><span style=\"color: #24292E\"> xgb<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        train_leaves <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> model.predict(xgb.DMatrix(X_train), <\/span><span style=\"color: #E36209\">pred_leaf<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">True<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        query_leaves <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> model.predict(xgb.DMatrix(X_query), <\/span><span style=\"color: #E36209\">pred_leaf<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: 
#005CC5\">True<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">elif<\/span><span style=\"color: #24292E\"> library <\/span><span style=\"color: #D73A49\">==<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #032F62\">&quot;lightgbm&quot;<\/span><span style=\"color: #24292E\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        train_leaves <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> model.predict(X_train, <\/span><span style=\"color: #E36209\">pred_leaf<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">True<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        query_leaves <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> model.predict(X_query, <\/span><span style=\"color: #E36209\">pred_leaf<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">True<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">elif<\/span><span style=\"color: #24292E\"> library <\/span><span style=\"color: #D73A49\">==<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #032F62\">&quot;catboost&quot;<\/span><span style=\"color: #24292E\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        train_leaves <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> model.calc_leaf_indexes(X_train)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        query_leaves <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> model.calc_leaf_indexes(X_query)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    n_trees 
<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> train_leaves.shape&#091;<\/span><span style=\"color: #005CC5\">1<\/span><span style=\"color: #24292E\">&#093;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    n_query <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #005CC5\">len<\/span><span style=\"color: #24292E\">(X_query)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #6A737D\"># Number of training samples reaching each leaf, per tree<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    leaf_counts <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> [Counter(train_leaves&#091;:, t&#093;) <\/span><span style=\"color: #D73A49\">for<\/span><span style=\"color: #24292E\"> t <\/span><span style=\"color: #D73A49\">in<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #005CC5\">range<\/span><span style=\"color: #24292E\">(n_trees)]<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #6A737D\"># Matrix of per-tree leaf training counts for each query sample<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    counts_matrix <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> np.zeros((n_query, n_trees))<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">for<\/span><span style=\"color: #24292E\"> t <\/span><span style=\"color: #D73A49\">in<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #005CC5\">range<\/span><span style=\"color: #24292E\">(n_trees):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #D73A49\">for<\/span><span style=\"color: #24292E\"> 
i <\/span><span style=\"color: #D73A49\">in<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #005CC5\">range<\/span><span style=\"color: #24292E\">(n_query):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">            counts_matrix&#091;i, t&#093; <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> leaf_counts&#091;t&#093;.get(query_leaves&#091;i, t&#093;, <\/span><span style=\"color: #005CC5\">0<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">return<\/span><span style=\"color: #24292E\"> {<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #032F62\">&quot;mean&quot;<\/span><span style=\"color: #24292E\">: counts_matrix.mean(<\/span><span style=\"color: #E36209\">axis<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">1<\/span><span style=\"color: #24292E\">),    <\/span><span style=\"color: #6A737D\"># Mean: standard confidence signal<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #032F62\">&quot;min&quot;<\/span><span style=\"color: #24292E\">: counts_matrix.min(<\/span><span style=\"color: #E36209\">axis<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">1<\/span><span style=\"color: #24292E\">),      <\/span><span style=\"color: #6A737D\"># Min: most conservative signal<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #032F62\">&quot;median&quot;<\/span><span style=\"color: #24292E\">: np.median(counts_matrix, <\/span><span style=\"color: #E36209\">axis<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">1<\/span><span style=\"color: #24292E\">),  <\/span><span style=\"color: #6A737D\"># Median: robust to 
outliers<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    }<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">def<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #6F42C1\">classify_by_leaf_coverage<\/span><span style=\"color: #24292E\">(score, train_scores):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    p5 <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> np.quantile(train_scores, <\/span><span style=\"color: #005CC5\">0.05<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    p25 <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> np.quantile(train_scores, <\/span><span style=\"color: #005CC5\">0.25<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">if<\/span><span style=\"color: #24292E\"> score <\/span><span style=\"color: #D73A49\">&gt;=<\/span><span style=\"color: #24292E\"> p25:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #D73A49\">return<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #032F62\">&quot;good&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">elif<\/span><span style=\"color: #24292E\"> score <\/span><span style=\"color: #D73A49\">&gt;=<\/span><span style=\"color: #24292E\"> p5:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #D73A49\">return<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #032F62\">&quot;borderline&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #D73A49\">else<\/span><span style=\"color: 
#24292E\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">        <\/span><span style=\"color: #D73A49\">return<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #032F62\">&quot;bad&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #6A737D\"># Example<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">X, y <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> make_regression(<\/span><span style=\"color: #E36209\">n_samples<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">1000<\/span><span style=\"color: #24292E\">, <\/span><span style=\"color: #E36209\">n_features<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">5<\/span><span style=\"color: #24292E\">, <\/span><span style=\"color: #E36209\">noise<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">0.1<\/span><span style=\"color: #24292E\">, <\/span><span style=\"color: #E36209\">random_state<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">42<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">X_train, X_test, y_train, y_test <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> train_test_split(X, y, <\/span><span style=\"color: #E36209\">test_size<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">0.2<\/span><span style=\"color: #24292E\">, <\/span><span style=\"color: #E36209\">random_state<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">42<\/span><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">dtrain <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> xgb.DMatrix(X_train, <\/span><span 
style=\"color: #E36209\">label<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\">y_train)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">xgb_model <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> xgb.train(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    {<\/span><span style=\"color: #032F62\">&quot;objective&quot;<\/span><span style=\"color: #24292E\">: <\/span><span style=\"color: #032F62\">&quot;reg:squarederror&quot;<\/span><span style=\"color: #24292E\">, <\/span><span style=\"color: #032F62\">&quot;max_depth&quot;<\/span><span style=\"color: #24292E\">: <\/span><span style=\"color: #005CC5\">4<\/span><span style=\"color: #24292E\">},<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    dtrain,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #E36209\">num_boost_round<\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #005CC5\">50<\/span><span style=\"color: #24292E\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">train_scores <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> leaf_coverage_scores(xgb_model, X_train, X_train)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">ood_sample <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> X_test&#091;:<\/span><span style=\"color: #005CC5\">5<\/span><span style=\"color: #24292E\">&#093; <\/span><span style=\"color: #D73A49\">+<\/span><span style=\"color: #24292E\"> <\/span><span style=\"color: #005CC5\">5.0<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">query_scores <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> leaf_coverage_scores(xgb_model, X_train, 
ood_sample)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D73A49\">for<\/span><span style=\"color: #24292E\"> s <\/span><span style=\"color: #D73A49\">in<\/span><span style=\"color: #24292E\"> query_scores&#091;<\/span><span style=\"color: #032F62\">&quot;mean&quot;<\/span><span style=\"color: #24292E\">&#093;:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    label <\/span><span style=\"color: #D73A49\">=<\/span><span style=\"color: #24292E\"> classify_by_leaf_coverage(s, train_scores&#091;<\/span><span style=\"color: #032F62\">&quot;mean&quot;<\/span><span style=\"color: #24292E\">&#093;)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #24292E\">    <\/span><span style=\"color: #005CC5\">print<\/span><span style=\"color: #24292E\">(<\/span><span style=\"color: #D73A49\">f<\/span><span style=\"color: #032F62\">&quot;score=<\/span><span style=\"color: #005CC5\">{<\/span><span style=\"color: #24292E\">s<\/span><span style=\"color: #D73A49\">:.2f<\/span><span style=\"color: #005CC5\">}<\/span><span style=\"color: #032F62\"> -&gt; <\/span><span style=\"color: #005CC5\">{<\/span><span style=\"color: #24292E\">label<\/span><span style=\"color: #005CC5\">}<\/span><span style=\"color: #032F62\">&quot;<\/span><span style=\"color: #24292E\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Use in Section 5.<\/strong> Feeds directly into <em>5.1 Reject option<\/em> and <em>5.2 Hybrid fallback<\/em>: samples below the 5th percentile are auto-rejected or routed to a linear fallback.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4.2 Forest \/ Ensemble Variance<\/h3>\n\n\n<style>.kadence-column6561_6e2066-87 > .kt-inside-inner-col,.kadence-column6561_6e2066-87 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_6e2066-87 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_6e2066-87 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_6e2066-87 > .kt-inside-inner-col > 
.aligncenter{width:100%;}.kadence-column6561_6e2066-87 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_6e2066-87{position:relative;}.kadence-column6561_6e2066-87, .kt-inside-inner-col > .kadence-column6561_6e2066-87:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_6e2066-87 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_6e2066-87 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_6e2066-87\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\"><strong>Principle.<\/strong> Use the spread of predictions across trees (or across an ensemble of models) as epistemic uncertainty. <strong>Disagreement between trees is a strong signal that a region was poorly covered during training.<\/strong> In dense regions, trees converge; in sparse regions, splitting decisions diverge.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rooted in the bias-variance decomposition: random forest variance is small in trained regions and large in unseen regions. Boosting models can imitate this with a multi-seed ensemble. 
Notable methods include Quantile Regression Forest (Meinshausen 2006) and the built-in quantile objectives (<code>objective=\"quantile\"<\/code> in LightGBM, <code>reg:quantileerror<\/code> in XGBoost 2.0 and later), which yield a Prediction Interval (PI) from a pair of lower- and upper-quantile models.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Numeric thresholds.<\/strong> The output is PI width or ensemble standard deviation.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Good:<\/strong> PI width \u2264 1.5 \u00d7 median PI width on training data, or ensemble standard deviation \u2264 75th percentile.<\/li>\n\n\n\n<li><strong>Borderline:<\/strong> 1.5 \u2013 3 \u00d7 median PI width, or std at 75th \u2013 95th percentile.<\/li>\n\n\n\n<li><strong>Bad:<\/strong> &gt; 3 \u00d7 median PI width, or std &gt; 95th percentile. PI width often inflates 5 \u2013 10\u00d7 in unseen regions.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nimport lightgbm as lgb\nfrom sklearn.ensemble import RandomForestRegressor\nfrom sklearn.datasets import make_regression\nfrom sklearn.model_selection import train_test_split\n\n\ndef quantile_forest_uncertainty(X_train, y_train, X_query, quantiles=(0.05, 0.95)):\n    &quot;&quot;&quot;Quantile PI from RF leaf distributions.&quot;&quot;&quot;\n    rf = RandomForestRegressor(n_estimators=100, random_state=42)\n    rf.fit(X_train, y_train)\n    train_leaves = rf.apply(X_train)\n    query_leaves = rf.apply(X_query)\n\n    intervals = &#x5B;]\n    for i in range(len(X_query)):\n        y_collected = &#x5B;]\n        for t in range(rf.n_estimators):\n            mask = train_leaves&#x5B;:, t] == query_leaves&#x5B;i, t]\n            y_collected.extend(y_train&#x5B;mask])\n        if len(y_collected) &gt; 0:\n            lower = np.quantile(y_collected, quantiles&#x5B;0])\n            upper = np.quantile(y_collected, quantiles&#x5B;1])\n            intervals.append((lower, upper, upper - lower))\n        else:\n            
intervals.append((np.nan, np.nan, np.inf))\n    return np.array(intervals)\n\n\ndef lgb_quantile_uncertainty(X_train, y_train, X_query, alpha=0.1):\n    &quot;&quot;&quot;PI via LightGBM quantile regression.&quot;&quot;&quot;\n    model_lower = lgb.LGBMRegressor(\n        objective=&quot;quantile&quot;, alpha=alpha \/ 2, n_estimators=200, verbose=-1\n    )\n    model_lower.fit(X_train, y_train)\n    model_upper = lgb.LGBMRegressor(\n        objective=&quot;quantile&quot;, alpha=1 - alpha \/ 2, n_estimators=200, verbose=-1\n    )\n    model_upper.fit(X_train, y_train)\n    lower = model_lower.predict(X_query)\n    upper = model_upper.predict(X_query)\n    return lower, upper, upper - lower\n\n\ndef xgb_ensemble_variance(X_train, y_train, X_query, n_seeds=10):\n    &quot;&quot;&quot;Multi-seed XGBoost ensemble variance.&quot;&quot;&quot;\n    import xgboost as xgb\n    preds = &#x5B;]\n    for seed in range(n_seeds):\n        model = xgb.XGBRegressor(\n            n_estimators=100, max_depth=4, random_state=seed,\n            subsample=0.8, colsample_bytree=0.8,\n        )\n        model.fit(X_train, y_train)\n        preds.append(model.predict(X_query))\n    preds = np.stack(preds, axis=0)\n    return preds.mean(axis=0), preds.var(axis=0)\n\n\ndef classify_by_pi_width(width, train_widths):\n    median = np.median(train_widths)\n    if width &lt;= median * 1.5:\n        return &quot;good&quot;\n    elif width &lt;= median * 3.0:\n        return &quot;borderline&quot;\n    else:\n        return &quot;bad&quot;\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Use in Section 5.<\/strong> The PI itself is the deliverable for <em>5.3 Wider PI<\/em>; PI width also feeds <em>5.1 Reject option<\/em>.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4.3 Density-Based Auxiliary OOD Detector<\/h3>\n\n\n<style>.kadence-column6561_f304d5-79 > .kt-inside-inner-col,.kadence-column6561_f304d5-79 > 
.kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_f304d5-79 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_f304d5-79 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_f304d5-79 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_f304d5-79 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_f304d5-79{position:relative;}.kadence-column6561_f304d5-79, .kt-inside-inner-col > .kadence-column6561_f304d5-79:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_f304d5-79 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_f304d5-79 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_f304d5-79\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\"><strong>Principle.<\/strong> Train a separate model that estimates the density $P_{\\text{train}}(x)$ and use its output to flag missing-path candidates. 
Keeping the prediction model and the OOD detector separate makes the OOD logic reusable across model versions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Common choices: <strong>Isolation Forest (IForest)<\/strong> (Liu 2008) \u2014 fast and effective on tabular data; <strong>Local Outlier Factor (LOF)<\/strong> \u2014 local-density relative outlier score; <strong>One-Class Support Vector Machine (One-Class SVM)<\/strong> \u2014 boundary of the normal region from positive-only data; <strong>Gaussian Mixture Model (GMM)<\/strong> or <strong>Kernel Density Estimation (KDE)<\/strong> \u2014 direct density estimation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The big win: this approach <strong>tackles joint-distribution OOD directly<\/strong>, catching exactly the sparse joint regions that are missing-path candidates regardless of marginal coverage.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Numeric thresholds.<\/strong> IForest&#8217;s <code>score_samples()<\/code> returns higher values for normal points (close to 0) and very negative values for anomalies. 
GMM&#8217;s <code>score_samples()<\/code> returns log-likelihood (higher = normal).<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Good:<\/strong> IForest score \u2265 25th percentile on training data, or GMM log-likelihood \u2265 training median.<\/li>\n\n\n\n<li><strong>Borderline:<\/strong> IForest score between the 5th and 25th percentile.<\/li>\n\n\n\n<li><strong>Bad:<\/strong> IForest score &lt; 5th percentile (<code>score_samples()<\/code> values lie in [\u22121, 0), so calibrate the cutoff on training scores rather than using a fixed number), or GMM log-likelihood more than 3 standard deviations below the training mean.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nimport xgboost as xgb\nfrom sklearn.ensemble import IsolationForest\nfrom sklearn.mixture import GaussianMixture\nfrom sklearn.datasets import make_regression\nfrom sklearn.model_selection import train_test_split\n\n\nclass OODDetector:\n    &quot;&quot;&quot;Unified interface over density-based OOD detectors.&quot;&quot;&quot;\n\n    def __init__(self, method=&quot;iforest&quot;):\n        self.method = method\n        self.model = None\n        self.threshold_5 = None\n        self.threshold_25 = None\n\n    def fit(self, X_train):\n        if self.method == &quot;iforest&quot;:\n            self.model = IsolationForest(contamination=&quot;auto&quot;, random_state=42)\n            self.model.fit(X_train)\n        elif self.method == &quot;gmm&quot;:\n            self.model = GaussianMixture(n_components=5, random_state=42)\n            self.model.fit(X_train)\n        else:\n            raise ValueError(f&quot;Unknown method: {self.method}&quot;)\n        # Thresholds are percentiles of the detector&#039;s scores on the training data\n        train_scores = self.model.score_samples(X_train)\n        self.threshold_5 = np.quantile(train_scores, 0.05)\n        self.threshold_25 = np.quantile(train_scores, 0.25)\n        return self\n\n    def score(self, X):\n        return self.model.score_samples(X)\n\n    def classify(self, X):\n        scores = self.score(X)\n        labels = &#x5B;]\n        for s in scores:\n            if s &gt;= self.threshold_25:\n                
labels.append(&quot;good&quot;)\n            elif s &gt;= self.threshold_5:\n                labels.append(&quot;borderline&quot;)\n            else:\n                labels.append(&quot;bad&quot;)\n        return labels, scores\n\n\nclass OODAwareTreeModel:\n    def __init__(self, tree_model, ood_detector):\n        self.tree_model = tree_model\n        self.ood_detector = ood_detector\n\n    def predict_with_ood(self, X):\n        labels, scores = self.ood_detector.classify(X)\n        y_pred = self.tree_model.predict(xgb.DMatrix(X))\n        return y_pred, labels, scores\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Use in Section 5.<\/strong> The good\/borderline\/bad labels are the most general routing signal \u2014 they drive <em>5.1, 5.2,<\/em> and <em>5.4<\/em>, and combine cleanly with the other techniques.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4.4 Conformal Prediction<\/h3>\n\n\n<style>.kadence-column6561_014a46-2d > .kt-inside-inner-col,.kadence-column6561_014a46-2d > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_014a46-2d > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_014a46-2d > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_014a46-2d > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_014a46-2d > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_014a46-2d{position:relative;}.kadence-column6561_014a46-2d, .kt-inside-inner-col > .kadence-column6561_014a46-2d:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_014a46-2d > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_014a46-2d > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div 
class=\"wp-block-kadence-column kadence-column6561_014a46-2d\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\"><strong>Principle.<\/strong> Conformal prediction (Vovk 2005) gives <strong>distribution-free, statistically guaranteed prediction intervals<\/strong> regardless of the underlying model. The idea: take residuals on a calibration set, pick the $1-\\alpha$ quantile, and add that margin to the point prediction.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Formally, the nonconformity score is $s_i = |y_i - \\hat{y}_i|$ on a calibration set; with quantile $q$ at confidence $1-\\alpha$, the PI is:<\/p>\n\n\n\n<p style=\"background-color: #fff; border: none\">$$\\text{PI}(x^*) = [\\hat{y}^* - q,\\; \\hat{y}^* + q]$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Marginal coverage of at least $1-\\alpha$ is guaranteed when calibration and test data are exchangeable. Because $q$ is a single constant, this basic split-conformal interval has the same width for every input; width becomes a missing-path signal only with adaptive variants. <strong>Conformalized Quantile Regression (CQR)<\/strong> (Romano 2019) layers conformal correction on top of quantile regression and naturally produces wider intervals in missing-path regions. The <strong>MAPIE<\/strong> (Model Agnostic Prediction Interval Estimator) library (Taquet 2022) provides a scikit-learn-compatible API working with any tree-based model as base estimator.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Numeric thresholds (at $1-\\alpha = 0.9$).<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Good:<\/strong> PI width \u2264 1.2 \u00d7 median training PI width.<\/li>\n\n\n\n<li><strong>Borderline:<\/strong> 1.2 \u2013 2.5 \u00d7 median.<\/li>\n\n\n\n<li><strong>Bad:<\/strong> &gt; 2.5 \u00d7 median. 
With CQR, missing-path regions often inflate to 3 \u2013 10\u00d7 normal width.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nfrom sklearn.datasets import make_regression\nfrom sklearn.model_selection import train_test_split\nimport lightgbm as lgb\n# pip install mapie\nfrom mapie.regression import MapieRegressor, MapieQuantileRegressor\n\n\ndef split_conformal_lgb(X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):\n    &quot;&quot;&quot;Split conformal prediction with LightGBM.&quot;&quot;&quot;\n    model = lgb.LGBMRegressor(n_estimators=100, verbose=-1)\n    model.fit(X_train, y_train)\n    cal_pred = model.predict(X_cal)\n    residuals = np.abs(y_cal - cal_pred)\n    # Finite-sample correction: take the ceil((n + 1)(1 - alpha)) \/ n empirical quantile\n    n = len(residuals)\n    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) \/ n)\n    q = np.quantile(residuals, level, method=&quot;higher&quot;)\n    test_pred = model.predict(X_test)\n    return test_pred, test_pred - q, test_pred + q\n\n\ndef cqr_lgb(X_train, y_train, X_test, alpha=0.1):\n    &quot;&quot;&quot;Conformalized Quantile Regression \u2014 wider PIs in missing-path regions.&quot;&quot;&quot;\n    model = lgb.LGBMRegressor(objective=&quot;quantile&quot;, n_estimators=200, verbose=-1)\n    mapie_qr = MapieQuantileRegressor(estimator=model, alpha=alpha)\n    mapie_qr.fit(X_train, y_train)\n    y_pred, y_pis = mapie_qr.predict(X_test)\n    return y_pred, y_pis&#x5B;:, 0, 0], y_pis&#x5B;:, 1, 0]\n\n\ndef classify_by_conformal_pi(width, train_widths):\n    median = np.median(train_widths)\n    if width &lt;= median * 1.2:\n        return &quot;good&quot;\n    elif width &lt;= median * 2.5:\n        return &quot;borderline&quot;\n    else:\n        return &quot;bad&quot;\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Use in Section 5.<\/strong> The conformal PI is the output of <em>5.3 Wider PI<\/em>. PI width also signals <em>5.1 Reject<\/em>. 
Statistical guarantees make this the <strong>standard tool in high-stakes domains<\/strong> (healthcare, finance).<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4.5 Range \/ Domain-Rule Detection<\/h3>\n\n\n<style>.kadence-column6561_dc68a4-8c > .kt-inside-inner-col,.kadence-column6561_dc68a4-8c > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_dc68a4-8c > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_dc68a4-8c > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_dc68a4-8c > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_dc68a4-8c > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_dc68a4-8c{position:relative;}.kadence-column6561_dc68a4-8c, .kt-inside-inner-col > .kadence-column6561_dc68a4-8c:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_dc68a4-8c > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_dc68a4-8c > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_dc68a4-8c\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\"><strong>Principle.<\/strong> The simplest, most direct check: record each feature&#8217;s training range and flag any inference sample falling outside it. Add explicit cross-feature constraints (e.g., <code>start_date &lt; end_date<\/code>, <code>age \u2265 0<\/code>) when domain knowledge supplies them.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This isn&#8217;t sophisticated, but it is the right first line of defense in production: nearly free, fast, with very low false-positive rate (marginal-support violations are unambiguous OOD). 
It can also handle conditional ranges \u2014 if all training samples with <code>age &lt; 18<\/code> had <code>income<\/code> in [0, 50K], an inference sample with <code>age=15, income=200K<\/code> violates that conditional range.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Numeric thresholds.<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Good:<\/strong> all features within [1st percentile, 99th percentile] and no conditional violations.<\/li>\n\n\n\n<li><strong>Borderline:<\/strong> at least one feature outside [1st, 99th] but inside [min, max], or one conditional violation.<\/li>\n\n\n\n<li><strong>Bad:<\/strong> at least one feature outside training [min, max], two or more conditional violations, or an unseen categorical value.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nimport pandas as pd\n\n\nclass RangeBasedOODDetector:\n    &quot;&quot;&quot;OOD detector based on feature ranges and conditional rules.&quot;&quot;&quot;\n\n    def __init__(self, conditional_rules=None):\n        self.feature_min = None\n        self.feature_max = None\n        self.feature_p1 = None\n        self.feature_p99 = None\n        self.cat_values = {}\n        self.conditional_rules = conditional_rules or &#x5B;]\n\n    def fit(self, X_train, categorical_cols=None):\n        if isinstance(X_train, pd.DataFrame):\n            num_data = X_train.select_dtypes(include=&#x5B;np.number]).values\n            num_cols = X_train.select_dtypes(include=&#x5B;np.number]).columns\n            self.num_cols = list(num_cols)\n            if categorical_cols:\n                for col in categorical_cols:\n                    self.cat_values&#x5B;col] = set(X_train&#x5B;col].unique())\n        else:\n            num_data = np.asarray(X_train)\n            self.num_cols = list(range(num_data.shape&#x5B;1]))\n\n        self.feature_min = num_data.min(axis=0)\n        
self.feature_max = num_data.max(axis=0)\n        self.feature_p1 = np.quantile(num_data, 0.01, axis=0)\n        self.feature_p99 = np.quantile(num_data, 0.99, axis=0)\n        return self\n\n    def classify_one(self, row):\n        if isinstance(row, pd.Series):\n            num_vals = row&#x5B;self.num_cols].values\n        else:\n            num_vals = np.asarray(row)\n\n        hard = ((num_vals &lt; self.feature_min) | (num_vals &gt; self.feature_max)).sum()\n        soft = ((num_vals &lt; self.feature_p1) | (num_vals &gt; self.feature_p99)).sum()\n\n        cat_violations = 0\n        if isinstance(row, pd.Series):\n            for col, valid in self.cat_values.items():\n                if row&#x5B;col] not in valid:\n                    cat_violations += 1\n\n        cond = 0\n        for rule in self.conditional_rules:\n            if rule(row):\n                cond += 1\n\n        if hard &gt; 0 or cat_violations &gt; 0 or cond &gt;= 2:\n            return &quot;bad&quot;\n        elif soft &gt; 0 or cond == 1:\n            return &quot;borderline&quot;\n        else:\n            return &quot;good&quot;\n\n    def classify(self, X):\n        if isinstance(X, pd.DataFrame):\n            return &#x5B;self.classify_one(row) for _, row in X.iterrows()]\n        else:\n            return &#x5B;self.classify_one(row) for row in X]\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Use in Section 5.<\/strong> Acts as a fast first gate: hard violations trigger immediate rejection in <em>5.1<\/em>. 
Being deterministic, it is also the easiest to monitor in production.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4.6 Comparison Summary<\/h3>\n\n\n<style>.kadence-column6561_0984a2-98 > .kt-inside-inner-col,.kadence-column6561_0984a2-98 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_0984a2-98 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_0984a2-98 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_0984a2-98 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_0984a2-98 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_0984a2-98{position:relative;}.kadence-column6561_0984a2-98, .kt-inside-inner-col > .kadence-column6561_0984a2-98:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_0984a2-98 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_0984a2-98 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_0984a2-98\"><div class=\"kt-inside-inner-col\">\n<figure class=\"wp-block-table\"><table><thead><tr><th>Technique<\/th><th>Output<\/th><th>Good \/ Bad criterion<\/th><th>Extra model<\/th><th>Inference cost<\/th><\/tr><\/thead><tbody><tr><td>4.1 Leaf coverage<\/td><td>Avg training samples per leaf<\/td><td>5th \/ 25th percentile of training scores<\/td><td>None<\/td><td>Low<\/td><\/tr><tr><td>4.2 Forest variance<\/td><td>PI width or ensemble std<\/td><td>1.5\u00d7 \/ 3\u00d7 training median<\/td><td>Sometimes<\/td><td>Mid \u2013 High<\/td><\/tr><tr><td>4.3 Density detector<\/td><td>IForest \/ GMM score<\/td><td>5th \/ 25th percentile of training scores<\/td><td>Required<\/td><td>Low \u2013 Mid<\/td><\/tr><tr><td>4.4 
Conformal<\/td><td>Guaranteed PI width<\/td><td>1.2\u00d7 \/ 2.5\u00d7 training median<\/td><td>Calibration set required<\/td><td>Mid<\/td><\/tr><tr><td>4.5 Range \/ domain rule<\/td><td>Number of violations<\/td><td>\u22651 hard \/ \u22651 soft<\/td><td>None<\/td><td>Very low<\/td><\/tr><\/tbody><\/table><\/figure>\n<\/div><\/div>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">5. Mitigations Driven by Detection Output<\/h2>\n\n\n<style>.kadence-column6561_35d593-67 > .kt-inside-inner-col,.kadence-column6561_35d593-67 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_35d593-67 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_35d593-67 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_35d593-67 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_35d593-67 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_35d593-67{position:relative;}.kadence-column6561_35d593-67, .kt-inside-inner-col > .kadence-column6561_35d593-67:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_35d593-67 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_35d593-67 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_35d593-67\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">The detection techniques in Section 4 emit good\/borderline\/bad labels and confidence scores. This section turns those outputs into concrete system behaviors. 
Mitigations are not used in isolation \u2014 they are <strong>composed in layers<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">5.1 Reject Option<\/h3>\n\n\n<style>.kadence-column6561_ed925e-c9 > .kt-inside-inner-col,.kadence-column6561_ed925e-c9 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_ed925e-c9 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_ed925e-c9 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_ed925e-c9 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_ed925e-c9 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_ed925e-c9{position:relative;}.kadence-column6561_ed925e-c9, .kt-inside-inner-col > .kadence-column6561_ed925e-c9:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_ed925e-c9 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_ed925e-c9 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_ed925e-c9\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">The most conservative response: when detection returns <strong>bad<\/strong>, the model abstains. The caller falls back to a human or a safe default action.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Don&#8217;t gate rejection on a single detector \u2014 use <strong>voting across detectors<\/strong> (e.g., reject only if at least two detectors flag <em>bad<\/em>) or a weighted score. 
This cuts the false-positive rate that one over-sensitive detector would otherwise inject.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndef safe_predict_with_reject(model, x, detectors):\n    &quot;&quot;&quot;Reject by consensus across multiple detectors.&quot;&quot;&quot;\n    labels = {name: det.classify(&#x5B;x])&#x5B;0] for name, det in detectors.items()}\n    bad_count = sum(1 for v in labels.values() if v == &quot;bad&quot;)\n    if bad_count &gt;= 2:\n        return None, &quot;REJECTED&quot;, labels\n    elif any(v == &quot;bad&quot; for v in labels.values()):\n        return model.predict(&#x5B;x])&#x5B;0], &quot;WARNING&quot;, labels\n    else:\n        return model.predict(&#x5B;x])&#x5B;0], &quot;OK&quot;, labels\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>When to use:<\/strong> medical diagnosis, autonomous-driving safety decisions, automated financial approvals \u2014 anywhere a wrong prediction is much more expensive than a missed one. 
Track the rejection rate as a Key Performance Indicator (KPI).<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">5.2 Hybrid Fallback Model<\/h3>\n\n\n<style>.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col,.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_4d0c90-d9{position:relative;}.kadence-column6561_4d0c90-d9, .kt-inside-inner-col > .kadence-column6561_4d0c90-d9:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_4d0c90-d9 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_4d0c90-d9\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">Rejection is safe but hurts the user experience. An alternative is to route <strong>bad\/borderline samples to a more conservative model<\/strong> (a linear model, a Generalized Additive Model (GAM), or a domain mean) and use the tree model only for <em>good<\/em> samples.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The trade-off is intentional: the tree wins inside the training distribution; the linear model loses accuracy but extrapolates predictably outside it. 
Combining them captures the strengths of both.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nfrom sklearn.linear_model import Ridge\nimport lightgbm as lgb\n\n\nclass HybridFallbackModel:\n    def __init__(self, primary, fallback, detector):\n        self.primary = primary    # tree model\n        self.fallback = fallback  # linear model\n        self.detector = detector\n\n    def fit(self, X, y):\n        self.primary.fit(X, y)\n        self.fallback.fit(X, y)\n        self.detector.fit(X)\n        return self\n\n    def predict(self, X):\n        labels = self.detector.classify(X)\n        primary_pred = self.primary.predict(X)\n        fallback_pred = self.fallback.predict(X)\n        is_safe = np.array(&#x5B;l == &quot;good&quot; for l in labels])\n        return np.where(is_safe, primary_pred, fallback_pred), labels\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>When to use:<\/strong> recommendation, pricing, demand forecasting \u2014 systems that must always return something, where missing answers cost more than a small accuracy hit.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">5.3 Wider Prediction Interval<\/h3>\n\n\n<style>.kadence-column6561_f838d4-5d > .kt-inside-inner-col,.kadence-column6561_f838d4-5d > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_f838d4-5d > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_f838d4-5d > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_f838d4-5d > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_f838d4-5d > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_f838d4-5d{position:relative;}.kadence-column6561_f838d4-5d, .kt-inside-inner-col > 
.kadence-column6561_f838d4-5d:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_f838d4-5d > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_f838d4-5d > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_f838d4-5d\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">Return a <strong>PI together with the point estimate<\/strong> so downstream systems can read off uncertainty directly. The PIs from 4.2 and 4.4 are reused as-is; missing-path regions naturally produce wider intervals.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The shift here is at the output-format level: replace &#8220;prediction = 100&#8221; with &#8220;90% PI = [60, 140]&#8221;. Downstream services can then judge confidence automatically from PI width.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nimport lightgbm as lgb\nfrom mapie.regression import MapieQuantileRegressor\n\n\nclass IntervalPredictor:\n    def __init__(self, alpha=0.1):\n        self.alpha = alpha\n        model = lgb.LGBMRegressor(objective=&quot;quantile&quot;, n_estimators=200, verbose=-1)\n        self.mapie = MapieQuantileRegressor(estimator=model, alpha=alpha)\n\n    def fit(self, X, y):\n        self.mapie.fit(X, y)\n        return self\n\n    def predict_interval(self, X):\n        y_pred, y_pis = self.mapie.predict(X)\n        return {\n            &quot;point&quot;: y_pred,\n            &quot;lower&quot;: y_pis&#x5B;:, 0, 0],\n            &quot;upper&quot;: y_pis&#x5B;:, 1, 0],\n            &quot;width&quot;: y_pis&#x5B;:, 1, 0] - y_pis&#x5B;:, 0, 0],\n        }\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>When to use:<\/strong> simulation, risk analysis, 
decision-support \u2014 environments where the user can interpret uncertainty themselves.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">5.4 Human-in-the-Loop Escalation<\/h3>\n\n\n<style>.kadence-column6561_bff1dc-7a > .kt-inside-inner-col,.kadence-column6561_bff1dc-7a > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_bff1dc-7a > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_bff1dc-7a > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_bff1dc-7a > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_bff1dc-7a > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_bff1dc-7a{position:relative;}.kadence-column6561_bff1dc-7a, .kt-inside-inner-col > .kadence-column6561_bff1dc-7a:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_bff1dc-7a > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_bff1dc-7a > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_bff1dc-7a\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">The most cautious mitigation: <strong>bad samples go to a human review queue<\/strong> instead of getting an automatic prediction. Once labeled, they feed retraining. 
This is the canonical OOD continual-learning pipeline.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Operational flow: (1) detector labels <em>bad<\/em> \u2192 (2) return a default or conservative response \u2192 (3) push the sample plus context to a review queue \u2192 (4) human labels it \u2192 (5) labeled samples enter the next retraining batch \u2192 (6) model and detectors are both refreshed.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport json\nfrom pathlib import Path\nfrom datetime import datetime\n\n\nclass HumanReviewQueue:\n    def __init__(self, queue_path=&quot;ood_review_queue.jsonl&quot;):\n        self.queue_path = Path(queue_path)\n\n    def add_sample(self, x, model_pred, ood_score, detector_labels, context=None):\n        record = {\n            &quot;timestamp&quot;: datetime.now().isoformat(),\n            &quot;input&quot;: x.tolist() if hasattr(x, &quot;tolist&quot;) else list(x),\n            &quot;model_prediction&quot;: float(model_pred) if model_pred is not None else None,\n            &quot;ood_score&quot;: float(ood_score),\n            &quot;detector_labels&quot;: detector_labels,\n            &quot;context&quot;: context or {},\n            &quot;human_label&quot;: None,\n            &quot;reviewed&quot;: False,\n        }\n        with open(self.queue_path, &quot;a&quot;) as f:\n            f.write(json.dumps(record) + &quot;\\n&quot;)\n\n    def load_reviewed(self):\n        if not self.queue_path.exists():\n            return &#x5B;]\n        results = &#x5B;]\n        with open(self.queue_path) as f:\n            for line in f:\n                rec = json.loads(line)\n                if rec&#x5B;&quot;reviewed&quot;] and rec&#x5B;&quot;human_label&quot;] is not None:\n                    results.append(rec)\n        return results\n\n\nclass SafeProductionModel:\n    &quot;&quot;&quot;Detection + reject + review queue, combined for 
production.&quot;&quot;&quot;\n\n    def __init__(self, model, detectors, queue):\n        self.model = model\n        self.detectors = detectors\n        self.queue = queue\n\n    def serve(self, x):\n        labels = {name: det.classify(&#x5B;x])&#x5B;0] for name, det in self.detectors.items()}\n        bad_count = sum(1 for v in labels.values() if v == &quot;bad&quot;)\n\n        if bad_count &gt;= 2:\n            self.queue.add_sample(x, None, -1.0, labels)\n            return {&quot;prediction&quot;: None, &quot;status&quot;: &quot;REJECTED_FOR_REVIEW&quot;}\n        elif bad_count == 1:\n            pred = self.model.predict(&#x5B;x])&#x5B;0]\n            self.queue.add_sample(x, pred, 0.0, labels, context={&quot;warning&quot;: True})\n            return {&quot;prediction&quot;: pred, &quot;status&quot;: &quot;PREDICTED_WITH_WARNING&quot;}\n        else:\n            return {&quot;prediction&quot;: self.model.predict(&#x5B;x])&#x5B;0], &quot;status&quot;: &quot;OK&quot;}\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>When to use:<\/strong> medical imaging, fraud detection, security anomaly detection, customer-service automation \u2014 domains where wrong predictions cost far more than human review.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">5.5 Retraining Trigger (Continual Learning)<\/h3>\n\n\n<style>.kadence-column6561_3aed47-b6 > .kt-inside-inner-col,.kadence-column6561_3aed47-b6 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_3aed47-b6 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_3aed47-b6 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_3aed47-b6 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_3aed47-b6 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_3aed47-b6{position:relative;}.kadence-column6561_3aed47-b6, 
.kt-inside-inner-col > .kadence-column6561_3aed47-b6:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_3aed47-b6 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_3aed47-b6 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_3aed47-b6\"><div class=\"kt-inside-inner-col\">\n<p class=\"wp-block-paragraph\">A <strong>rising rate of bad\/borderline labels over time<\/strong> is a strong signal that the model is going stale. Monitor it and trigger retraining automatically.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Typical pattern: aggregate the OOD ratio per hour or day; raise an alert (or fire a retraining job) when it exceeds a threshold; combine the labeled samples from the 5.4 review queue with existing training data and refit. Promote the new model only if it passes a regression test.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\n\n\nclass DriftMonitor:\n    def __init__(self, ood_ratio_threshold=0.1, retrain_callback=None):\n        self.threshold = ood_ratio_threshold\n        self.retrain_callback = retrain_callback\n        self.window = &#x5B;]\n        self.window_size = 1000\n\n    def record(self, label):\n        self.window.append(1 if label == &quot;bad&quot; else 0)\n        if len(self.window) &gt; self.window_size:\n            self.window.pop(0)\n\n    def check_and_trigger(self):\n        if len(self.window) &lt; self.window_size:\n            return False\n        ood_ratio = np.mean(self.window)\n        if ood_ratio &gt; self.threshold:\n            print(f&quot;OOD ratio={ood_ratio:.2%} exceeds threshold. 
Triggering retrain.&quot;)\n            if self.retrain_callback:\n                self.retrain_callback()\n            self.window = &#x5B;]\n            return True\n        return False\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>When to use:<\/strong> any long-running deployment, especially where user behavior, market conditions, or sensor environments drift over time.<\/p>\n<\/div><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">5.6 Recommended Layered Patterns<\/h3>\n\n\n<style>.kadence-column6561_71f463-a0 > .kt-inside-inner-col,.kadence-column6561_71f463-a0 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_71f463-a0 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_71f463-a0 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_71f463-a0 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_71f463-a0 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_71f463-a0{position:relative;}.kadence-column6561_71f463-a0, .kt-inside-inner-col > .kadence-column6561_71f463-a0:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_71f463-a0 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_71f463-a0 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column6561_71f463-a0\"><div class=\"kt-inside-inner-col\">\n<ul class=\"wp-block-list\">\n<li><strong>Low-risk, fast iteration:<\/strong> 4.1 (leaf coverage) + 4.5 (range rule) \u2192 5.3 (wider PI). No extra models, near-zero overhead.<\/li>\n\n\n\n<li><strong>Mid-risk, standard production:<\/strong> 4.3 (Isolation Forest) + 4.4 (CQR) \u2192 5.2 (hybrid fallback) + 5.5 (drift monitor). 
Joint OOD detection plus coverage-guaranteed PIs; the tree model stays fast on safe inputs while the linear fallback covers the rest.<\/li>\n\n\n\n<li><strong>High-risk (healthcare \/ finance \/ safety-critical):<\/strong> all of 4.1, 4.3, 4.4, 4.5 \u2192 5.1 (reject) + 5.4 (human-in-the-loop) + 5.5 (retrain). Multi-detector consensus drives rejection; flagged samples go to humans.<\/li>\n\n\n\n<li><strong>Monitoring across all tiers:<\/strong> log OOD scores, leaf coverage, PI width, and rejection rate as time series. Trigger retraining on drift in the <em>distribution<\/em> of these signals, which tends to be more robust than a single absolute threshold.<\/li>\n<\/ul>\n<\/div><\/div>\n<\/div><\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" style=\"margin-top:var(--wp--preset--spacing--60);margin-bottom:var(--wp--preset--spacing--60)\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n<style>.kadence-column6561_9d9f89-58 > .kt-inside-inner-col,.kadence-column6561_9d9f89-58 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column6561_9d9f89-58 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column6561_9d9f89-58 > .kt-inside-inner-col{flex-direction:column;}.kadence-column6561_9d9f89-58 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column6561_9d9f89-58 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column6561_9d9f89-58{position:relative;}.kadence-column6561_9d9f89-58, .kt-inside-inner-col > .kadence-column6561_9d9f89-58:not(.specificity){margin-left:var(--global-kb-spacing-sm, 1.5rem);}@media all and (max-width: 1024px){.kadence-column6561_9d9f89-58 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column6561_9d9f89-58 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column 
kadence-column6561_9d9f89-58\"><div class=\"kt-inside-inner-col\">\n<ol class=\"wp-block-list\">\n<li>Liu, F. T., Ting, K. M., &amp; Zhou, Z. H. (2008). Isolation Forest. <em>ICDM<\/em>.<\/li>\n\n\n\n<li>Meinshausen, N. (2006). Quantile Regression Forests. <em>Journal of Machine Learning Research<\/em>, 7, 983-999.<\/li>\n\n\n\n<li>Romano, Y., Patterson, E., &amp; Candes, E. J. (2019). Conformalized Quantile Regression. <em>NeurIPS<\/em>.<\/li>\n\n\n\n<li>Taquet, V., Blot, V., Morzadec, T., Lacombe, L., &amp; Brunel, N. (2022). MAPIE: an open-source library for distribution-free uncertainty quantification. <em>arXiv:2207.12274<\/em>.<\/li>\n\n\n\n<li>Vovk, V., Gammerman, A., &amp; Shafer, G. (2005). <em>Algorithmic Learning in a Random World<\/em>. Springer.<\/li>\n\n\n\n<li>Yang, J., Zhou, K., Li, Y., &amp; Liu, Z. (2021). Generalized Out-of-Distribution Detection: A Survey. <em>arXiv:2110.11334<\/em>.<\/li>\n<\/ol>\n<\/div><\/div>\n<div style='text-align:center' class='yasr-auto-insert-overall'><\/div><div style='text-align:center' class='yasr-auto-insert-visitor'><\/div>","protected":false},"excerpt":{"rendered":"<p>Bottom Line Strictly speaking, no \u2014 but in practice, treat them as Out-of-Distribution (OOD). Missing-path samples in tree-based boosting models such as LightGBM, CatBoost, and XGBoost do not match the academic definition of OOD perfectly, yet they carry essentially the same risk in deployed systems. 
This post explains why this is a borderline case, surveys&#8230;<\/p>\n","protected":false},"author":4,"featured_media":6562,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","yasr_overall_rating":0,"yasr_post_is_review":"","yasr_auto_insert_disabled":"","yasr_review_type":"","fifu_image_url":"","fifu_image_alt":"","iawp_total_views":2,"footnotes":""},"categories":[56],"tags":[],"class_list":["post-6561","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science-slug"],"yasr_visitor_votes":{"stars_attributes":{"read_only":false,"span_bottom":false},"number_of_votes":1,"sum_votes":5},"jetpack_featured_media_url":"https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/05\/2025122x-Endless-Blue-Sky-Santa-Fe-Highway-900x600px.jpg","_links":{"self":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts\/6561","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/comments?post=6561"}],"version-history":[{"count":4,"href":"https:\/\/ykim.s
ynology.me\/wordpress\/wp-json\/wp\/v2\/posts\/6561\/revisions"}],"predecessor-version":[{"id":6574,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts\/6561\/revisions\/6574"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/media\/6562"}],"wp:attachment":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/media?parent=6561"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/categories?post=6561"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/tags?post=6561"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}