{"id":6147,"date":"2026-04-08T22:30:56","date_gmt":"2026-04-09T03:30:56","guid":{"rendered":"https:\/\/ykim.synology.me\/wordpress\/?p=6147"},"modified":"2026-04-09T01:49:40","modified_gmt":"2026-04-09T06:49:40","slug":"the-impact-of-variance-components-on-the-coefficient-of-determination-r2","status":"publish","type":"post","link":"https:\/\/ykim.synology.me\/wordpress\/the-impact-of-variance-components-on-the-coefficient-of-determination-r2-6147\/","title":{"rendered":"The Impact of Variance Components on the Coefficient of Determination ($R^2$)"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"200\" src=\"https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/04\/the-coefficient-of-determination-R^2-300x200px.png\" alt=\"\" class=\"wp-image-6150\" style=\"width:400px\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">1. Executive Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The Coefficient of Determination, denoted as $R^2$, is one of the most widely used metrics for assessing the goodness-of-fit in linear regression models. However, it is often misinterpreted, because it responds not just to the &#8220;correctness&#8221; of a model, but also to the underlying distribution of the data. This report explores the mathematical and conceptual reasons why changes in variance, specifically residual variance ($\\sigma^2_{\\epsilon}$) and predictor variance ($\\sigma^2_{x}$), exert a profound influence on $R^2$. By analyzing the ratio of variances, we demonstrate that $R^2$ is a relative measure of explanatory power rather than an absolute measure of model accuracy.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">2. 
Mathematical Definition of $R^2$<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">To understand why variance dictates the behavior of $R^2$, we must first define it through the lens of Analysis of Variance (ANOVA). In a standard linear model $Y = \\beta_0 + \\beta_1 X + \\epsilon$, the total variation in the dependent variable $Y$ can be partitioned into two distinct components:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Explained Variation ($SS_{reg}$):<\/strong> The variation accounted for by the fitted relationship between $X$ and $Y$.<\/li>\n\n\n\n<li><strong>Unexplained Variation ($SS_{res}$):<\/strong> The variation left in the residuals, or &#8220;noise&#8221; ($\\epsilon$).<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">For an ordinary least squares fit that includes an intercept, the fundamental identity is:<br>$$SS_{tot} = SS_{reg} + SS_{res}$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">From this, $R^2$ is defined as the proportion of the total variation in $Y$ that is explained by $X$:<br>$$R^2 = \\frac{SS_{reg}}{SS_{tot}} = 1 - \\frac{SS_{res}}{SS_{tot}}$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Where:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>$SS_{reg}$ (Regression Sum of Squares):<\/strong> $\\sum (\\hat{y}_i - \\bar{y})^2$<\/li>\n\n\n\n<li><strong>$SS_{res}$ (Residual Sum of Squares):<\/strong> $\\sum (y_i - \\hat{y}_i)^2$<\/li>\n\n\n\n<li><strong>$SS_{tot}$ (Total Sum of Squares):<\/strong> $\\sum (y_i - \\bar{y})^2$<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">3. The Impact of Increased Error Variance (Noise)<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The first scenario involves an increase in the variance of the residuals ($\\sigma^2_{\\epsilon}$), assuming the true relationship ($\\beta_1$) and the range of $X$ remain constant.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3.1 The Mathematical Mechanism<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">As the noise in the data increases, each observed value $y_i$ tends to deviate further from its fitted value $\\hat{y}_i$ on the regression line. This directly inflates the $SS_{res}$ term. 
In the formula $R^2 = 1 - \\frac{SS_{res}}{SS_{tot}}$, the denominator $SS_{tot} = SS_{reg} + SS_{res}$ also grows, but because $SS_{reg}$ is unchanged, the numerator grows proportionally faster and the fraction $\\frac{SS_{res}}{SS_{tot}}$ moves closer to 1. Consequently, when this larger value is subtracted from 1, the resulting $R^2$ decreases.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3.2 Conceptual Interpretation: The Signal-to-Noise Ratio<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In the context of information theory and machine learning, we view the relationship between $X$ and $Y$ as the <strong>Signal<\/strong> and the residuals as the <strong>Noise<\/strong>. When the error variance increases, the noise overwhelms the signal. Even if the underlying model is &#8220;correct&#8221; (i.e., you have identified the true $\\beta_1$), the share of the variation in $Y$ that it can explain is diluted.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><strong>Key Phrase:<\/strong> &#8220;Increased noise or residual variance diminishes the model&#8217;s explanatory power, leading to a lower $R^2$.&#8221;<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">This illustrates that a low $R^2$ does not necessarily mean the model is &#8220;wrong&#8221;; it may simply mean the environment is inherently noisy, making the dependent variable difficult to predict with high precision.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">4. The Impact of Increased Predictor Variance (Range of X)<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">A more counterintuitive phenomenon occurs when we change the variance of the independent variable $X$. 
If we expand the range of $X$ values (thereby increasing $\\sigma^2_{x}$), the $R^2$ typically increases, even if the error variance $\\sigma^2_{\\epsilon}$ remains exactly the same.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4.1 The Expansion of the Denominator<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In a simple linear regression, the explained sum of squares can be expressed in terms of the fitted slope $\\hat{\\beta}_1$ as:<br>$$SS_{reg} = \\hat{\\beta}_1^2 \\cdot \\sum (x_i - \\bar{x})^2$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When the variance of $X$ increases, $\\sum (x_i - \\bar{x})^2$ increases. Provided the slope is nonzero, this causes $SS_{reg}$ to grow. Since $SS_{tot} = SS_{reg} + SS_{res}$, and $SS_{res}$ is held constant (same sample size and error variance), the denominator $SS_{tot}$ grows solely because the &#8220;explained&#8221; part is growing.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the fraction $\\frac{SS_{res}}{SS_{tot}}$, the denominator is getting larger while the numerator stays the same. This makes the fraction smaller, and $1 - (\\text{smaller number})$ results in a higher $R^2$. In population terms, $R^2$ approaches $\\frac{\\beta_1^2 \\sigma^2_{x}}{\\beta_1^2 \\sigma^2_{x} + \\sigma^2_{\\epsilon}}$ for large samples, which rises toward 1 as $\\sigma^2_{x}$ grows while $\\beta_1 \\neq 0$ and $\\sigma^2_{\\epsilon}$ stay fixed.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4.2 The Strength of the Trend<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">When we measure $X$ over a wider range, the overall &#8220;trend&#8221; or slope becomes more dominant relative to the local fluctuations (noise). The model captures a larger portion of the total spread of $Y$ because that spread is now driven more by the change in $X$ than by the random error.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><strong>Key Phrase:<\/strong> &#8220;A wider range or higher variance in the independent variable often inflates the $R^2$, as the model captures a larger portion of the overall trend.&#8221;<\/p>\n<\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\">5. Summary Table: Variance vs. 
$R^2$<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The following table summarizes the relationship between variance components and the resulting coefficient of determination.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th class=\"has-text-align-left\" data-align=\"left\">Scenario<\/th><th class=\"has-text-align-left\" data-align=\"left\">Effect on $R^2$<\/th><th class=\"has-text-align-left\" data-align=\"left\">Statistical Reason<\/th><\/tr><\/thead><tbody><tr><td class=\"has-text-align-left\" data-align=\"left\"><strong>Higher Residual Variance ($\\sigma^2_{\\epsilon}$)<\/strong><\/td><td class=\"has-text-align-left\" data-align=\"left\"><strong>Decreases<\/strong><\/td><td class=\"has-text-align-left\" data-align=\"left\">The &#8220;unexplained&#8221; portion ($SS_{res}$) of the data becomes a larger fraction of the total.<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><strong>Higher Predictor Variance ($\\sigma^2_{x}$)<\/strong><\/td><td class=\"has-text-align-left\" data-align=\"left\"><strong>Increases<\/strong><\/td><td class=\"has-text-align-left\" data-align=\"left\">The &#8220;explained&#8221; portion ($SS_{reg}$) grows, making the noise relatively less significant.<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\"><strong>Lower Total Variance ($SS_{tot}$)<\/strong><\/td><td class=\"has-text-align-left\" data-align=\"left\"><strong>Decreases<\/strong><\/td><td class=\"has-text-align-left\" data-align=\"left\">When the total spread of $Y$ is small, even minor errors lead to a low $R^2$.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">6. 
Practical Implications for AI\/ML Models<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In machine learning, relying solely on $R^2$ can be misleading due to these variance dependencies.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Model Comparison:<\/strong> One cannot easily compare the $R^2$ of a model trained on a narrow dataset with one trained on a diverse, wide-ranging dataset. The latter will likely have a higher $R^2$ simply due to the variance in $X$.<\/li>\n\n\n\n<li><strong>Overfitting Risks:<\/strong> High variance in $X$ can sometimes mask poor model performance in specific sub-regions of the data.<\/li>\n\n\n\n<li><strong>Feature Selection:<\/strong> When adding features, we are essentially trying to increase the explained variance ($SS_{reg}$) to reduce the relative weight of the residuals.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">7. Conclusion: $R^2$ as a Relative Metric<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The reason why changes in variance affect $R^2$ is that $R^2$ is a <strong>ratio<\/strong>. It does not measure the absolute magnitude of the error (like MSE or MAE), but rather the error relative to the total spread of the data.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If the <strong>Unexplained Variance<\/strong> increases, the ratio of &#8220;error to total&#8221; rises, and $R^2$ falls.<\/li>\n\n\n\n<li>If the <strong>Explained Variance<\/strong> increases (via a wider range of $X$), the ratio of &#8220;error to total&#8221; falls, and $R^2$ rises.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Understanding this dynamic is crucial for any data analyst or machine learning engineer. It prevents the common pitfall of dismissing a model with a low $R^2$ in a high-noise environment, or over-trusting a model with a high $R^2$ derived from an artificially wide range of independent variables.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">8. 
Key Terminology Reference<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Coefficient of Determination ($R^2$):<\/strong> \uacb0\uc815\uacc4\uc218. The proportion of variance in the dependent variable that is predictable from the independent variable.<\/li>\n\n\n\n<li><strong>Explanatory Power:<\/strong> \uc124\uba85\ub825. The capacity of a model to represent the underlying patterns in the data.<\/li>\n\n\n\n<li><strong>Residual\/Error Variance:<\/strong> \uc794\ucc28\/\uc624\ucc28 \ubd84\uc0b0. The variance of the differences between observed and predicted values.<\/li>\n\n\n\n<li><strong>Signal-to-Noise Ratio (SNR):<\/strong> \uc2e0\ud638 \ub300 \uc7a1\uc74c\ube44. A measure that compares the level of a desired signal to the level of background noise.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">9. $R^2$ Variation with Sample Distributions Placed Along the 1\u2011to\u20111 Line<\/h2>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/github.com\/ykim2718\/AIML\/blob\/main\/sigma_r2.png?raw=true\" alt=\"\" style=\"width:700px\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/github.com\/ykim2718\/AIML\/blob\/fa26858b340ee8e40912c105328f0acfbfbe1c27\/sigma_r2.py\" target=\"_blank\" rel=\"noreferrer noopener\">Python Code<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n<div style='text-align:center' class='yasr-auto-insert-overall'><\/div><div style='text-align:center' class='yasr-auto-insert-visitor'><\/div>","protected":false},"excerpt":{"rendered":"<p>1. Executive Summary The Coefficient of Determination, denoted as $R^2$, is one of the most widely used metrics for assessing the goodness-of-fit in linear regression models. 
However, its interpretation is often fraught with misunderstanding, particularly regarding how it fluctuates not just with the &#8220;correctness&#8221; of a model, but with the underlying distribution of the data&#8230;.<\/p>\n","protected":false},"author":4,"featured_media":6150,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","yasr_overall_rating":0,"yasr_post_is_review":"","yasr_auto_insert_disabled":"","yasr_review_type":"","fifu_image_url":"","fifu_image_alt":"","iawp_total_views":0,"footnotes":""},"categories":[56,369],"tags":[],"class_list":["post-6147","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science-slug","category-evalutaion-metric-slug"],"yasr_visitor_votes":{"stars_attributes":{"read_only":false,"span_bottom":false},"number_of_votes":0,"sum_votes":0},"jetpack_featured_media_url":"https:\/\/ykim.synology.me\/wordpress\/wp-content\/uploads\/2026\/04\/the-coefficient-of-determination-R^2-300x200px.png","_links":{"self":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts\/6147","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/w
p\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/comments?post=6147"}],"version-history":[{"count":8,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts\/6147\/revisions"}],"predecessor-version":[{"id":6176,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/posts\/6147\/revisions\/6176"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/media\/6150"}],"wp:attachment":[{"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/media?parent=6147"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/categories?post=6147"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ykim.synology.me\/wordpress\/wp-json\/wp\/v2\/tags?post=6147"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}