An Introductory Survey on Polynomial Machine Learning: Taxonomic Axes and Hierarchical Levels

 This report surveys Polynomial Machine Learning (PML) at an introductory level. PML refers to the family of techniques that exploit higher-order and interaction terms of input variables to learn nonlinear relationships. The discussion is organized along three taxonomic axes and arranged into six hierarchical levels. Deep mathematical or theoretical analysis is intentionally avoided; the goal is to provide a structured starting point for further study.

Taxonomic Axes and Hierarchical Levels

 The three classification axes act as a coordinate system. Each of the six levels sits on exactly one axis, with two levels per axis.

| Axis | Meaning | Levels |
|---|---|---|
| Axis A: Mathematical Foundation | On which mathematical property (orthogonality, domain) the polynomial is defined | Level 1, Level 2 |
| Axis B: Model Architecture | Into which computational structure (neural network, tensor decomposition) the polynomial is embedded | Level 3, Level 4 |
| Axis C: Application Pattern | How the polynomial is combined with or discovered from other models | Level 5, Level 6 |

Evolution Across Levels (Add / Subtract / Exchange)

 The following table shows what is added (+), removed (−), or exchanged (↔) when moving from one level to the next.

| Transition | (+) Added | (−) Removed | (↔) Exchanged |
|---|---|---|---|
| L1 → L2 | Orthogonality constraint, link to probability distributions | Free use of arbitrary monomials | Power basis ↔ Orthogonal polynomial basis |
| L2 → L3 | Learnable weights, hierarchical layers, optional activations | Closed-form coefficient estimation | Single regression equation ↔ Multilayer neural network |
| L3 → L4 | Tensor decomposition, latent vector representation | Per-term explicit weight | Explicit term weight ↔ Latent vector inner product |
| L4 → L5 | Coupling with non-polynomial models (GP, CNN, physics) | Single-model assumption | Polynomial-only model ↔ Polynomial + residual hybrid |
| L5 → L6 | Discovery of the equation itself from data | Pre-fixed model form | Human-prescribed form ↔ Data-discovered form |

Taxonomy Hierarchy

Polynomial Machine Learning (PML)
│
[Axis A: Mathematical Foundation]
├── [Level 1] Classical Polynomial Models
│   ├── 1.1 Polynomial Regression
│   ├── 1.2 Polynomial Kernel Methods
│   └── 1.3 Response Surface Methodology (RSM)
│
├── [Level 2] Orthogonal Polynomial Basis Models
│   ├── Group 2-A: Theory-based Orthogonal Polynomials
│   │   ├── 2.1 Spatial-domain (Zernike / Chebyshev / Legendre / Fourier-Bessel)
│   │   ├── 2.2 Probabilistic-domain — Wiener-Askey (Hermite / Laguerre / Jacobi / Gegenbauer)
│   │   └── 2.3 Discrete-domain (Charlier / Krawtchouk / Meixner / Hahn)
│   ├── Group 2-B: Learning Frameworks Using Orthogonal Polynomials
│   │   ├── 2.4 Polynomial Chaos Expansion (PCE)
│   │   ├── 2.5 Sparse PCE & LARS-PCE
│   │   └── 2.6 Arbitrary PCE (aPCE)
│   └── Group 2-C: Data-driven Orthogonal Bases
│       └── 2.7 Karhunen-Loève (KL) Expansion / Proper Orthogonal Decomposition (POD)
│
[Axis B: Model Architecture]
├── [Level 3] Polynomial Neural Architectures
│   ├── 3.1 Group Method of Data Handling (GMDH)
│   ├── 3.2 Modern Polynomial Neural Networks (PNN)
│   ├── 3.3 Pi-Nets
│   ├── 3.4 Self-Organizing Polynomial NN (SOPNN)
│   └── 3.5 Kolmogorov-Arnold Networks (KAN)
│
├── [Level 4] Tensor & Factorization-based Polynomial Models
│   ├── 4.1 Factorization Machines (FM)
│   ├── 4.2 Higher-Order FM (HOFM)
│   ├── 4.3 Tensor Train / Tensor Regression
│   └── 4.4 Polynomial Tensor Decomposition
│
[Axis C: Application Pattern]
├── [Level 5] Hybrid & Surrogate Modeling
│   ├── 5.1 PCE-Kriging
│   ├── 5.2 Global + Residual Models (Spline / GNN / CNN)
│   ├── 5.3 Physics-Informed Polynomial Models
│   └── 5.4 Multi-fidelity Polynomial Surrogates
│
└── [Level 6] Symbolic & Sparse Polynomial Discovery
    ├── 6.1 Sparse Identification of Nonlinear Dynamics (SINDy)
    ├── 6.2 Symbolic Regression with Polynomial Basis
    └── 6.3 LASSO / Elastic-Net Polynomial Feature Selection

Level 1. Classical Polynomial Models [Axis A]

 The most fundamental layer, where linear models are extended directly using a power basis ($1, x, x^2, \ldots$). No orthogonality constraint applies.

1.1 Polynomial Regression

 A monomial-basis regression of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_d x^d + \epsilon$. As the degree increases, multicollinearity becomes severe, so it is standard practice to combine it with regularization such as Ridge or Least Absolute Shrinkage and Selection Operator (LASSO).
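
 As a minimal sketch of the idea above, the following fits a ridge-regularized polynomial with plain NumPy. The helper name `fit_poly_ridge`, the penalty value `lam`, and the synthetic data are illustrative assumptions, and for simplicity the penalty is applied to the intercept as well.

```python
import numpy as np

def fit_poly_ridge(x, y, degree, lam=1e-6):
    """Fit y ~ sum_k beta_k x^k with an L2 penalty lam on all coefficients."""
    X = np.vander(x, degree + 1, increasing=True)   # columns 1, x, ..., x^d
    # Ridge normal equations: (X^T X + lam I) beta = X^T y
    A = X.T @ X + lam * np.eye(degree + 1)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data (assumed): y = 1 + 2x - 3x^2 plus small noise
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.01 * rng.standard_normal(x.size)

beta = fit_poly_ridge(x, y, degree=2)
print(beta)   # close to [1, 2, -3]
```

 Larger values of `lam` trade a little bias for a better-conditioned solve, which matters more as the degree grows.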

1.2 Polynomial Kernel Methods

 The kernel $K(x, y) = (x^T y + c)^d$ enables computation of inner products in a polynomial feature space without explicit high-dimensional mapping. It is used in Support Vector Machines (SVM), Kernel Ridge Regression, and Kernel Principal Component Analysis (Kernel PCA).
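
 The "no explicit mapping" claim can be checked numerically for a small case. The sketch below compares the kernel value with the inner product of an explicit degree-2 feature map for 2-D inputs; the function names and the test vectors are illustrative assumptions.

```python
import numpy as np

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x^T y + c)^d."""
    return (x @ y + c) ** d

def phi_degree2(x, c=1.0):
    """Explicit feature map matching the kernel for d = 2 and 2-D inputs."""
    x1, x2 = x
    return np.array([x1**2, x2**2, np.sqrt(2.0) * x1 * x2,
                     np.sqrt(2.0 * c) * x1, np.sqrt(2.0 * c) * x2, c])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, y)               # (1*3 + 2*(-1) + 1)^2 = 4
k_explicit = phi_degree2(x) @ phi_degree2(y)
print(k_implicit, k_explicit)
```

 The explicit map already needs six features for two inputs at degree 2; the kernel evaluates the same inner product with a single dot product.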

1.3 Response Surface Methodology (RSM)

 A second-order polynomial model $y = \beta_0 + \sum_i \beta_i x_i + \sum_i \beta_{ii} x_i^2 + \sum_{i \lt j} \beta_{ij} x_i x_j$ is fitted to designed-experiment data and used to locate process optima. Combined with Central Composite Design (CCD) and Box-Behnken Design, it serves as a de facto standard for recipe optimization in etch, deposition, and Chemical Mechanical Planarization (CMP) processes (Myers et al. 2016).
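
 A small two-factor sketch of this workflow: fit the quadratic surface on a face-centred CCD in coded units, then solve for the stationary point $x^* = -\tfrac{1}{2} B^{-1} b$, where $b$ holds the linear coefficients and $B$ is the symmetric matrix of quadratic coefficients. The response function and the design are hypothetical.

```python
import numpy as np

def quad_design(X):
    """Columns: 1, x1, x2, x1^2, x2^2, x1*x2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x2**2, x1 * x2])

# Face-centred CCD in coded units: 4 factorial + 4 axial + 1 centre point
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
              [-1, 0], [1, 0], [0, -1], [0, 1], [0, 0]], dtype=float)

# Hypothetical process response with its optimum at (0.3, -0.2)
y = 5.0 - (X[:, 0] - 0.3) ** 2 - 2.0 * (X[:, 1] + 0.2) ** 2

beta, *_ = np.linalg.lstsq(quad_design(X), y, rcond=None)
b = beta[1:3]                                   # linear terms
B = np.array([[beta[3], beta[5] / 2.0],         # quadratic terms,
              [beta[5] / 2.0, beta[4]]])        # interaction split in half
x_star = -0.5 * np.linalg.solve(B, b)           # stationary point
print(x_star)                                   # close to [0.3, -0.2]
```

 In practice the eigenvalues of `B` should also be inspected to tell a maximum from a minimum or a saddle.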

Level 2. Orthogonal Polynomial Basis Models [Axis A]

 Level 2 introduces the constraint of orthogonality on top of Level 1 to gain stability and interpretability in coefficient estimation. Two polynomials $\phi_i, \phi_j$ are orthogonal under a weight function $w(x)$ if

$$\int_a^b \phi_i(x)\,\phi_j(x)\,w(x)\,dx = 0, \quad i \neq j$$

 Benefits gained by exploiting orthogonality include: (i) coefficients can be estimated independently via orthogonal projection; (ii) adding higher-order terms does not perturb existing coefficients; (iii) output variance decomposes additively, enabling direct sensitivity analysis such as Sobol indices (see Appendix A); and (iv) the regression matrix is well-conditioned. A general treatment of orthogonal polynomials is given in Appendix B.

 The seven sub-items are organized along two sub-axes: (A) the source of the basis functions (analytically defined vs. data-driven) and (B) the type of domain (spatial / probabilistic / discrete).

| Group | Domain | Items |
|---|---|---|
| 2-A: Theory-based | Spatial | 2.1 Zernike / Chebyshev / Legendre / Fourier-Bessel |
| 2-A: Theory-based | Probabilistic | 2.2 Wiener-Askey (Hermite / Laguerre / Jacobi / Gegenbauer) |
| 2-A: Theory-based | Discrete lattice | 2.3 Charlier / Krawtchouk / Meixner / Hahn |
| 2-B: Learning frameworks | Stochastic input → output regression | 2.4 PCE / 2.5 Sparse PCE / 2.6 aPCE |
| 2-C: Data-driven bases | Measurement data | 2.7 KL Expansion / POD |

2.1 Spatial-domain Orthogonal Polynomials

 These polynomials are orthogonal on a specific geometric domain such as a wafer or die. They are used to decompose measured spatial data into global patterns.

| Polynomial | Domain | Strength | Process Variation Use Case |
|---|---|---|---|
| Zernike | Circle | Aligns with optical aberration orthogonality | Wafer-level Warp/Bow, global thickness and overlay decomposition |
| Chebyshev | Square | Suppresses Runge phenomenon, minimax approximation | Scanner slit area, intra-die pattern variation |
| Legendre | Square / interval | Simple integration, center-weighted | Flatness/roughness variation, linear trend separation |
| Fourier-Bessel | Circle | Stable at the edge, captures high-frequency content | Wafer edge roll-off, post-CMP edge zone |
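
 To make the decomposition concrete, here is a minimal sketch that fits the first four Zernike terms (piston, two tilts, defocus, in Noll normalization) to a synthetic wafer map by least squares. The site locations, coefficients, and noise level are assumptions made up for illustration.

```python
import numpy as np

def zernike_basis(x, y):
    """First four Zernike terms on the unit disk (Noll normalization)."""
    r2 = x**2 + y**2
    return np.column_stack([
        np.ones_like(x),                  # piston
        2.0 * x,                          # tilt x: 2 r cos(theta)
        2.0 * y,                          # tilt y: 2 r sin(theta)
        np.sqrt(3.0) * (2.0 * r2 - 1.0),  # defocus (bow-like shape)
    ])

# Hypothetical measurement sites: random points inside the unit disk
rng = np.random.default_rng(1)
pts = rng.uniform(-1.0, 1.0, size=(2000, 2))
pts = pts[np.sum(pts**2, axis=1) <= 1.0]
x, y = pts[:, 0], pts[:, 1]

true_c = np.array([10.0, 0.5, -0.2, 1.5])          # assumed coefficients
z = zernike_basis(x, y) @ true_c + 0.01 * rng.standard_normal(x.size)

coef, *_ = np.linalg.lstsq(zernike_basis(x, y), z, rcond=None)
residual = z - zernike_basis(x, y) @ coef          # local, non-global part
print(coef)                                        # close to true_c
```

 The fitted coefficients describe the global shape; whatever remains in `residual` is the fine-scale variation that a low-order basis cannot capture.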

2.2 Probabilistic-domain Orthogonal Polynomials (Wiener-Askey Mapping)

 When the input is a random variable, the orthogonal polynomial is selected so that its weight function matches the Probability Density Function (PDF) of that distribution. Xiu & Karniadakis (2002) extended the original Hermite-Gaussian pairing of PCE to the entire Askey scheme, providing a one-to-one mapping between distributions and orthogonal families. This report refers to that mapping as the Wiener-Askey mapping.

| Polynomial | Interval / Weight | Corresponding Distribution |
|---|---|---|
| Hermite | $(-\infty, \infty)$, $e^{-x^2/2}$ | Gaussian |
| Laguerre | $[0, \infty)$, $e^{-x}$ | Gamma / Exponential |
| Jacobi | $[-1, 1]$, $(1-x)^\alpha(1+x)^\beta$ | Beta |
| Gegenbauer | $[-1, 1]$, $(1-x^2)^{\alpha-1/2}$ | Special case of Beta |
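
 The Hermite-Gaussian pairing in the first row can be verified numerically. The sketch below uses NumPy's probabilists' Hermite module and Gauss-Hermite quadrature to compute $\mathbb{E}[He_m(X)\,He_n(X)]$ for $X \sim N(0,1)$; the choice of 20 nodes and of degrees 0 to 4 is arbitrary.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

# Quadrature for the weight exp(-x^2/2); rescale so weights sum to 1,
# which turns the rule into an expectation under N(0, 1).
nodes, weights = He.hermegauss(20)
weights = weights / math.sqrt(2.0 * math.pi)

def he(n, x):
    """Probabilists' Hermite polynomial He_n evaluated at x."""
    return He.hermeval(x, [0] * n + [1])

gram = np.array([[np.sum(weights * he(m, nodes) * he(n, nodes))
                  for n in range(5)] for m in range(5)])
print(np.round(gram, 10))   # diagonal 0!, 1!, 2!, 3!, 4!; zeros elsewhere
```

 The off-diagonal zeros are the orthogonality statement; the diagonal entries $n!$ are the normalization constants needed when converting coefficients into variances.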

2.3 Discrete-domain Orthogonal Polynomials

 These are families orthogonal on integer lattices, suitable for discrete inputs such as defect counts.

  • Charlier: Poisson distribution
  • Krawtchouk: Binomial distribution
  • Meixner: Negative Binomial distribution
  • Hahn: Hypergeometric distribution

2.4 Polynomial Chaos Expansion (PCE)

 PCE expands the response of a system with stochastic inputs as a series of orthogonal polynomials (Ghanem & Spanos 1991; Xiu & Karniadakis 2002).

$$Y = \sum_{\alpha \in \mathcal{A}} c_\alpha\, \Psi_\alpha(\xi)$$

| Symbol | Meaning |
|---|---|
| $Y$ | System output (scalar or vector), e.g., wafer thickness, critical dimension |
| $\xi = (\xi_1, \ldots, \xi_d)$ | Standardized stochastic input vector, each $\xi_i$ following a known distribution |
| $d$ | Number of stochastic input variables |
| $\alpha = (\alpha_1, \ldots, \alpha_d)$ | Multi-index, $\alpha_i \in \mathbb{N}_0$, indicating the polynomial degree per input |
| $\mathcal{A}$ | Set of multi-indices used (typically $\sum_i \alpha_i \leq p$) |
| $\Psi_\alpha(\xi)$ | Product of univariate orthogonal polynomials: $\prod_{i=1}^d \psi_{\alpha_i}(\xi_i)$ |
| $c_\alpha$ | Polynomial coefficient (target of learning) |
| $p$ | Truncation order of the expansion |

 Thanks to orthogonality, Sobol sensitivity indices (see Appendix A) can be obtained analytically from $c_\alpha$.
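
 A minimal regression-based PCE, assuming two independent inputs uniform on $[-1, 1]$ (hence a Legendre basis) and a made-up response function. The helper names are hypothetical. The mean and variance are then read directly from the coefficients, using $\mathbb{E}[P_n^2] = 1/(2n+1)$ under the uniform density.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre as L

def multi_indices(d, p):
    """All alpha in N_0^d with alpha_1 + ... + alpha_d <= p."""
    return [a for a in itertools.product(range(p + 1), repeat=d) if sum(a) <= p]

def pce_design(xi, alphas):
    """Column alpha holds Psi_alpha(xi) = prod_i P_{alpha_i}(xi_i)."""
    cols = []
    for a in alphas:
        col = np.ones(len(xi))
        for i, n in enumerate(a):
            col = col * L.legval(xi[:, i], [0] * n + [1])
        cols.append(col)
    return np.column_stack(cols)

rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 0.5 + xi[:, 0] + xi[:, 1] ** 2              # hypothetical response

alphas = multi_indices(d=2, p=2)
c, *_ = np.linalg.lstsq(pce_design(xi, alphas), y, rcond=None)

# E[Psi_alpha^2] is the product of the univariate norms 1 / (2n + 1)
norms = [np.prod([1.0 / (2 * n + 1) for n in a]) for a in alphas]
mean = c[alphas.index((0, 0))]
variance = sum(ck**2 * h for a, ck, h in zip(alphas, c, norms) if any(a))
print(mean, variance)                           # 5/6 and 19/45 for this model
```

 For this response the exact values are $\mathbb{E}[Y] = 5/6$ and $\mathrm{Var}(Y) = 1/3 + 4/45 = 19/45$, so the output can be checked by hand.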

2.5 Sparse PCE & LARS-PCE

Why does the term count explode? The number of PCE terms with $d$-dimensional input and total degree up to $p$, written $P + 1$ so that the terms are indexed $0, \ldots, P$, is

$$P + 1 = \binom{d + p}{p} = \frac{(d+p)!}{d!\,p!}$$

 This counts every multi-index $\alpha \in \mathbb{N}_0^d$ with $\sum_i \alpha_i \leq p$, so the count grows combinatorially in both $d$ and $p$. For example: $d=10, p=4 \Rightarrow 1{,}001$ terms; $d=20, p=4 \Rightarrow 10{,}626$; $d=50, p=4 \Rightarrow 316{,}251$. In semiconductor processes with tens to hundreds of inputs, the number of terms quickly exceeds the number of available samples, making naive PCE infeasible.
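
 The counts quoted above follow directly from the binomial formula and can be reproduced with the standard library (the function name is an illustrative choice):

```python
from math import comb

def pce_terms(d, p):
    """Number of multi-indices of total degree <= p in d variables."""
    return comb(d + p, p)

for d in (10, 20, 50):
    print(d, pce_terms(d, 4))
# 10 1001
# 20 10626
# 50 316251
```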

 The remedy is sparsity. Least Angle Regression (LARS) or Orthogonal Matching Pursuit (OMP) selects only the most important polynomial terms. Adaptive Sparse PCE (Blatman & Sudret 2011) is a representative method.
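
 A bare-bones sketch of the greedy selection idea, using Orthogonal Matching Pursuit on a one-dimensional Legendre basis. It is not the adaptive algorithm of Blatman & Sudret; the stopping rule (a fixed number of terms), the basis size, and the synthetic sparse response are assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as L

def omp(A, y, n_terms):
    """Greedy OMP: repeatedly add the column most correlated with the residual."""
    norms = np.linalg.norm(A, axis=0)
    residual, support = y.copy(), []
    for _ in range(n_terms):
        score = np.abs(A.T @ residual) / norms
        if support:
            score[support] = -1.0                 # never reselect a column
        support.append(int(np.argmax(score)))
        # Re-fit all selected coefficients jointly, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    full = np.zeros(A.shape[1])
    full[support] = coef
    return full

rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, 300)
A = L.legvander(xi, 15)                           # 16 candidate basis terms
c_true = np.zeros(16)
c_true[[2, 5, 9]] = [3.0, -2.0, 1.5]              # assumed sparse truth
y = A @ c_true

c_hat = omp(A, y, n_terms=3)
print(np.flatnonzero(c_hat))                      # selected term indices
```

 In a real setting the number of retained terms would be chosen by a validation criterion such as leave-one-out error rather than fixed in advance.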

2.6 Arbitrary PCE (aPCE)

 When the input distribution is unknown or non-standard, the orthogonal polynomial basis can be constructed directly from the empirical moments of the data (Oladyshkin & Nowak 2012). aPCE is useful for irregular process data where the Wiener-Askey mapping does not apply.
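
 One way to build such a basis, sketched under assumptions: form the Hankel matrix of raw empirical moments and use its Cholesky factor to obtain monomial coefficients of polynomials that are orthonormal under the empirical distribution. This is a compact linear-algebra route to the same goal, not a transcription of the formulas in the cited paper, and the lognormal sample is made up.

```python
import numpy as np

def apce_basis(samples, degree):
    """Return T so that psi_k(x) = sum_j T[j, k] x^j is orthonormal
    with respect to the empirical distribution of `samples`."""
    # Raw empirical moments m_k = mean(x^k), k = 0 .. 2*degree
    m = [np.mean(samples ** k) for k in range(2 * degree + 1)]
    # Hankel moment matrix M[i, j] = m_{i+j}
    M = np.array([[m[i + j] for j in range(degree + 1)]
                  for i in range(degree + 1)])
    R = np.linalg.cholesky(M).T        # M = R^T R with R upper triangular
    return np.linalg.inv(R)            # column k: coefficients of psi_k

rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=0.3, size=5000)   # skewed, non-standard input

T = apce_basis(data, degree=3)
Psi = np.vander(data, 4, increasing=True) @ T          # psi_0..psi_3 at the data
gram = Psi.T @ Psi / len(data)                         # empirical E[psi_i psi_j]
print(np.round(gram, 6))                               # close to the identity
```

 Because `T` is upper triangular, $\psi_k$ has degree exactly $k$. High-order raw moments become ill-conditioned quickly, so this direct construction is only reasonable for low degrees.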

2.7 Data-driven Orthogonal Bases (KL Expansion / POD)

 Whereas 2.1–2.6 use polynomials defined a priori, 2.7 builds the basis directly from the data. The covariance structure of measurements is eigen-decomposed, and the eigenvectors corresponding to the largest eigenvalues serve as an orthogonal basis. Conceptually, the data itself reveals its own dominant variation modes (mode 1, mode 2, mode 3, …).

  • Karhunen-Loève (KL) Expansion: Optimal orthogonal decomposition of a random field; suitable for extracting the principal modes of wafer-to-wafer (W2W) variation (Loève 1978).
  • Proper Orthogonal Decomposition (POD): Discrete and practical version of KL; mathematically equivalent to Principal Component Analysis (PCA).
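
 A short sketch of POD via the singular value decomposition of a centered snapshot matrix, which is equivalent to eigen-decomposing the sample covariance. The data are synthetic: each row is a hypothetical wafer measured at the same sites, generated from two spatial shapes plus noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n_wafers, n_sites = 40, 100
s = np.linspace(-1.0, 1.0, n_sites)

# Hypothetical data: two spatial shapes with wafer-dependent amplitudes
shapes = np.vstack([s**2, s])
amps = rng.standard_normal((n_wafers, 2)) * np.array([3.0, 1.0])
X = amps @ shapes + 0.01 * rng.standard_normal((n_wafers, n_sites))

Xc = X - X.mean(axis=0)                       # remove the mean wafer
U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)

energy = sing**2 / np.sum(sing**2)            # variance fraction per mode
modes = Vt[:2]                                # leading spatial modes (rows)
scores = Xc @ modes.T                         # per-wafer mode amplitudes
print(np.round(energy[:3], 4))
```

 The rows of `modes` are orthonormal by construction, and `energy` indicates how many modes are needed to explain most of the variation.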

Level 3. Polynomial Neural Architectures [Axis B]

 Level 3 embeds polynomial combinations (higher-order and interaction terms) inside neuron computations.

3.1 Group Method of Data Handling (GMDH)

 The progenitor of Polynomial Neural Networks (PNN) (Ivakhnenko 1971). At each layer, second-order polynomial candidates over variable pairs $(x_i, x_j)$ are generated, and only nodes that pass an external validation criterion advance to the next layer, allowing the network to grow autonomously.

$$y = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2$$
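
 A sketch of a single GMDH layer under simplifying assumptions: every input pair gets one quadratic node fitted by least squares on a training split, and the nodes with the lowest mean squared error on a held-out split survive. The `keep` parameter, the split, and the synthetic target are illustrative choices.

```python
import itertools
import numpy as np

def quad_features(a, b):
    """Terms of the node polynomial: 1, a, b, a*b, a^2, b^2."""
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

def gmdh_layer(X_tr, y_tr, X_va, y_va, keep=2):
    """Fit one quadratic node per input pair; keep the best by validation MSE."""
    nodes = []
    for i, j in itertools.combinations(range(X_tr.shape[1]), 2):
        coef, *_ = np.linalg.lstsq(quad_features(X_tr[:, i], X_tr[:, j]),
                                   y_tr, rcond=None)
        pred = quad_features(X_va[:, i], X_va[:, j]) @ coef
        nodes.append((np.mean((pred - y_va) ** 2), (i, j), coef))
    nodes.sort(key=lambda t: t[0])                # external criterion
    return nodes[:keep]

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 4))
y = 1.0 + X[:, 0] * X[:, 2] + 0.5 * X[:, 2] ** 2 + 0.01 * rng.standard_normal(300)

best = gmdh_layer(X[:200], y[:200], X[200:], y[200:], keep=2)
print(best[0][1], best[0][0])                     # winning pair and its MSE
```

 In a full GMDH network the outputs of the surviving nodes become the inputs of the next layer, and growth stops when the validation criterion no longer improves.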

3.2 Modern Polynomial Neural Networks (PNN)

 A modern extension of GMDH in which the degree, variable selection, and number of layers are determined adaptively.

3.3 Pi-Nets

 Pi-Nets (Chrysos et al. 2020) express the output as a higher-order polynomial expansion of the input and use tensor decompositions (CANDECOMP/PARAFAC, Tucker) to keep parameter counts tractable. They achieve strong expressive power even without activation functions.

3.4 Self-Organizing Polynomial Neural Networks (SOPNN)

 An extension of GMDH (Oh & Pedrycz 2002) that allows partial polynomials of varying degree at each node.

3.5 Kolmogorov-Arnold Networks (KAN)

 KAN (Liu et al. 2024) places learnable univariate functions (B-splines or polynomials) on the edges of the network. Choosing polynomial bases for the edge functions effectively yields a structured generalization of PNN.

Level 4. Tensor & Factorization-based Polynomial Models [Axis B]

 This level handles higher-order interaction terms efficiently through tensor decomposition.

4.1 Factorization Machines (FM)

 Factorization Machines (Rendle 2010) represent the interaction weights of a second-order polynomial regression as inner products of low-dimensional latent vectors rather than learning each weight independently.

$$\hat{y} = w_0 + \sum_i w_i x_i + \sum_{i \lt j} \langle v_i, v_j \rangle\, x_i x_j$$

 FM is particularly effective on sparse data such as recommendation systems and click-through-rate prediction.
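
 The pairwise sum in the FM model can be evaluated in time linear in the number of features through the identity $\sum_{i \lt j} \langle v_i, v_j \rangle x_i x_j = \tfrac{1}{2} \sum_f \big[ (\sum_i v_{if} x_i)^2 - \sum_i v_{if}^2 x_i^2 \big]$. The sketch below checks this against the naive double loop; the parameter values are random placeholders.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """FM prediction; V has one latent row vector v_i per feature."""
    s = V.T @ x                          # sum_i v_if x_i, shape (k,)
    s2 = (V**2).T @ (x**2)               # sum_i v_if^2 x_i^2, shape (k,)
    return w0 + w @ x + 0.5 * np.sum(s**2 - s2)

def fm_predict_naive(x, w0, w, V):
    """Reference implementation with the explicit sum over i < j."""
    n = len(x)
    pair = sum((V[i] @ V[j]) * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))
    return w0 + w @ x + pair

rng = np.random.default_rng(0)
n, k = 6, 3                              # features, latent dimension
x = rng.standard_normal(n)
w0, w, V = 0.5, rng.standard_normal(n), rng.standard_normal((n, k))

fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
print(fast, slow)
```

 The fast form costs $O(kn)$ instead of $O(kn^2)$, and with sparse inputs only the nonzero entries of `x` contribute.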

4.2 Higher-Order Factorization Machines (HOFM)

 HOFM (Blondel et al. 2016) extends FM to third- and fourth-order interactions efficiently using ANOVA kernels.
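
 The degree-$m$ ANOVA kernel sums the products $\prod_t p_{j_t} x_{j_t}$ over all index sets $j_1 \lt \cdots \lt j_m$. A minimal sketch of the standard dynamic-programming evaluation, compared with brute-force enumeration, is given below; the function names and random inputs are assumptions, and a full HOFM would sum such kernels over several latent factors and orders.

```python
import itertools
import numpy as np

def anova_kernel(p, x, m):
    """Degree-m ANOVA kernel via dynamic programming, O(n * m) time."""
    z = p * x
    a = np.zeros(m + 1)
    a[0] = 1.0
    for zj in z:
        # Descending order so each feature enters a product at most once
        for t in range(m, 0, -1):
            a[t] += zj * a[t - 1]
    return a[m]

def anova_brute(p, x, m):
    """Reference: explicit sum over all m-subsets of features."""
    z = p * x
    return sum(np.prod(z[list(c)])
               for c in itertools.combinations(range(len(z)), m))

rng = np.random.default_rng(0)
p, x = rng.standard_normal(6), rng.standard_normal(6)

k_dp = anova_kernel(p, x, m=3)
k_bf = anova_brute(p, x, m=3)
print(k_dp, k_bf)
```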

4.3 Tensor Train / Tensor Regression

 The coefficients of a multivariate polynomial are viewed as a tensor and compressed via Tensor Train (TT) decomposition or Tucker decomposition. Low-rank PCE applied to high-dimensional PCE belongs to this category (Konakli & Sudret 2016).

4.4 Polynomial Tensor Decomposition

 A formulation that casts tensor decomposition itself in polynomial form.

Level 5. Hybrid & Surrogate Modeling [Axis C]

 Level 5 combines different polynomial techniques, or polynomial models with non-polynomial models, to maximize expressive power.

5.1 PCE-Kriging

 PCE-Kriging, introduced under the name PC-Kriging (Schöbi et al. 2015), captures global trends with PCE and models the residual as a Gaussian Process (GP). The combination is a natural fit for virtual metrology.
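
 A deliberately simplified one-dimensional sketch of the trend-plus-residual idea: a Legendre trend fitted by ordinary least squares, followed by a Gaussian-process interpolant of the residual with a fixed RBF kernel. The published method treats the PCE as the trend of a universal Kriging model and estimates the kernel hyperparameters; both steps are omitted here, and the length scale, nugget, and data are assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as L

def rbf(a, b, length=0.2, var=1.0):
    """Squared-exponential covariance between 1-D point sets a and b."""
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def fit_pck(x, y, degree=2, nugget=1e-8):
    F = L.legvander(x, degree)
    beta, *_ = np.linalg.lstsq(F, y, rcond=None)      # polynomial trend
    K = rbf(x, x) + nugget * np.eye(len(x))
    alpha = np.linalg.solve(K, y - F @ beta)          # GP weights on residual
    return beta, alpha

def predict_pck(x_new, x, beta, alpha, degree=2):
    return L.legvander(x_new, degree) @ beta + rbf(x_new, x) @ alpha

x = np.linspace(-1.0, 1.0, 12)
y = x**2 + 0.3 * np.sin(6.0 * x)          # smooth trend + local wiggle (assumed)

beta, alpha = fit_pck(x, y)
y_fit = predict_pck(x, x, beta, alpha)    # reproduces the training data
y_new = predict_pck(np.array([0.05, 0.55]), x, beta, alpha)
print(np.max(np.abs(y_fit - y)), y_new)
```

 The polynomial part extrapolates the global shape, while the GP part pulls the prediction through the observed points.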

5.2 Global + Residual Models

 Low-order orthogonal polynomials (e.g., Zernike) capture global shape, while a Spline (notably the Thin Plate Spline), Graph Neural Network (GNN), or Convolutional Neural Network (CNN) learns the fine-scale residual. This hybrid is effective for local distortions such as those caused by chuck adsorption.

5.3 Physics-Informed Polynomial Models

 A polynomial variant of Physics-Informed Neural Networks (PINN) (Raissi et al. 2019). Governing equations are included in the loss function, and the solution is expanded in a polynomial basis (typically Chebyshev or Legendre); this construction is referred to as a Spectral PINN.

5.4 Multi-fidelity Polynomial Surrogates

 Combines low-fidelity (fast simulation) and high-fidelity (measurement) data (Kennedy & O’Hagan 2000). It is essential for virtual metrology that fuses Technology Computer-Aided Design (TCAD) simulations with measurements.

Level 6. Symbolic & Sparse Polynomial Discovery [Axis C]

 The most recent direction: discovering interpretable polynomial expressions directly from data.

6.1 Sparse Identification of Nonlinear Dynamics (SINDy)

 SINDy (Brunton et al. 2016) builds a library matrix from candidate functions (polynomials, trigonometric terms, etc.) and applies sparse regression to it, thereby identifying the governing equations of a dynamical system. The original paper uses sequentially thresholded least squares; LASSO is a common alternative.
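
 A minimal sketch of the sparse-regression step using sequentially thresholded least squares, the solver used in the original SINDy paper, on a degree-2 polynomial library. To keep the example short, the derivatives are generated from a known linear system with small added noise rather than estimated from a trajectory; the system, threshold, and helper names are assumptions.

```python
import numpy as np

def library(X):
    """Candidate terms for a 2-D state: 1, x, y, x^2, x*y, y^2."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x, y, x**2, x * y, y**2])

def stlsq(Theta, dX, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: zero small terms, then re-fit."""
    Xi, *_ = np.linalg.lstsq(Theta, dX, rcond=None)
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dX.shape[1]):              # one equation per state
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dX[:, k],
                                             rcond=None)[0]
    return Xi

# Assumed ground truth: dx/dt = -0.1 x + 2 y,  dy/dt = -2 x - 0.1 y
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
A_true = np.array([[-0.1, 2.0], [-2.0, -0.1]])
dX = X @ A_true.T + 1e-3 * rng.standard_normal(X.shape)

Xi = stlsq(library(X), dX)
print(np.round(Xi, 3))    # rows: library terms; columns: dx/dt, dy/dt
```

 Only the rows for `x` and `y` remain nonzero, which is the recovered linear system. The threshold acts as the sparsity knob and normally needs tuning.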

6.2 Symbolic Regression with Polynomial Basis

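 Tools based on genetic programming and related evolutionary search, such as PySR (Cranmer 2023), search over symbolic expressions assembled from basic operators; restricting the operators to addition and multiplication yields polynomial expressions. (The abbreviation GP is reserved for Gaussian Process in this report.)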

6.3 LASSO / Elastic-Net Polynomial Feature Selection

 A classical approach: a large number of polynomial features are generated and then pruned via regularization (Tibshirani 1996; Zou & Hastie 2005).
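
 A self-contained sketch of this generate-then-prune pipeline: all monomials up to degree 3 in two inputs are generated, standardized, and passed to a small coordinate-descent LASSO solver for the objective $\tfrac{1}{2n}\lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert_1$. The solver, the penalty `lam`, and the synthetic target are illustrative assumptions; a library implementation would normally be used instead.

```python
import itertools
import numpy as np

def poly_features(X, degree):
    """All monomials x1^a1 * x2^a2 * ... with 0 < a1 + a2 + ... <= degree."""
    names, cols = [], []
    for a in itertools.product(range(degree + 1), repeat=X.shape[1]):
        if 0 < sum(a) <= degree:
            names.append(a)
            cols.append(np.prod(X ** np.array(a), axis=1))
    return names, np.column_stack(cols)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent with soft-thresholding."""
    n, p = X.shape
    beta, r = np.zeros(p), y.copy()
    col_sq = np.sum(X**2, axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]                 # remove feature j's effect
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - n * lam, 0.0) / col_sq[j]
            r -= X[:, j] * beta[j]                 # add the updated effect back
    return beta

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
y = 2.0 * X[:, 0] * X[:, 1] - 1.5 * X[:, 0] ** 2 + 0.01 * rng.standard_normal(500)

names, F = poly_features(X, degree=3)
F = (F - F.mean(axis=0)) / F.std(axis=0)           # standardize columns
beta = lasso_cd(F, y - y.mean(), lam=0.05)
selected = [names[j] for j in np.flatnonzero(beta)]
print(selected)                                    # exponent tuples kept
```

 Each tuple lists the exponents of the two inputs, so `(1, 1)` is the interaction $x_1 x_2$ and `(2, 0)` is $x_1^2$. The surviving coefficients are shrunk by the penalty, so a final unpenalized re-fit on the selected terms is a common follow-up.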

Mapping to Wafer Process Variation Modeling (Within-Wafer, WiW / Wafer-to-Wafer, W2W)

| Application | Taxonomy Position | Method |
|---|---|---|
| WiW spatial decomposition (global shape) | Level 2.1 | Zernike / Chebyshev / Legendre / Fourier-Bessel regression |
| WiW spatial decomposition (local residual) | Level 5.2 | Global + Spline/GNN/CNN residual |
| W2W variation mode extraction | Level 2.7 | KL Expansion / POD |
| Process variation Uncertainty Quantification (UQ) | Level 2.4, 2.5 | PCE / Sparse PCE |
| Virtual Metrology | Level 5.1, 5.4 | PCE-Kriging, Multi-fidelity surrogate |
| Process recipe optimization (DOE) | Level 1.3 | RSM |
| Process dynamics discovery | Level 6.1 | SINDy |

Appendix A. Sobol Sensitivity Indices

 Sobol indices (Sobol 1993) are a global-sensitivity measure that quantifies how much of the output variance is attributable to each input variable, or to combinations of inputs.

A.1 ANOVA Decomposition

 Suppose $Y = f(X_1, X_2, \ldots, X_d)$ admits the Analysis of Variance (ANOVA) decomposition

$$f(X) = f_0 + \sum_i f_i(X_i) + \sum_{i \lt j} f_{ij}(X_i, X_j) + \cdots + f_{1,2,\ldots,d}(X_1, \ldots, X_d)$$

 If the inputs are independent, the components are mutually orthogonal and the output variance decomposes as $\mathrm{Var}(Y) = \sum_i V_i + \sum_{i \lt j} V_{ij} + \cdots$, where $V_i = \mathrm{Var}(f_i(X_i))$ is the contribution of $X_i$ alone and $V_{ij}$ captures the interaction between $X_i$ and $X_j$.

A.2 First-order and Total-effect Indices

$$S_i = \frac{V_i}{\mathrm{Var}(Y)}, \qquad S_i^T = 1 - \frac{\mathrm{Var}(\mathbb{E}[Y \mid X_{\sim i}])}{\mathrm{Var}(Y)}$$

 $S_i$ is the contribution of $X_i$ acting alone, while the total-effect index $S_i^T$ is the total contribution of $X_i$ including all its interactions.

A.3 Closed-form Computation in PCE

 In a PCE $Y = \sum_\alpha c_\alpha \Psi_\alpha(\xi)$, orthogonality makes the variance a simple sum:

$$\mathrm{Var}(Y) = \sum_{\alpha \neq 0} c_\alpha^2\, \mathbb{E}[\Psi_\alpha^2]$$

 Letting $\mathcal{A}_i = \{\alpha : \alpha_i > 0,\ \alpha_j = 0\ \forall j \neq i\}$, we have

$$S_i = \frac{\sum_{\alpha \in \mathcal{A}_i} c_\alpha^2\, \mathbb{E}[\Psi_\alpha^2]}{\sum_{\alpha \neq 0} c_\alpha^2\, \mathbb{E}[\Psi_\alpha^2]}$$

 All Sobol indices follow in closed form from the PCE coefficients alone, with no additional simulation required. This property (Sudret 2008) is the central reason PCE has become a standard in UQ.
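
 The closed-form computation can be sketched in a few lines. The coefficients below are those of the hypothetical model $Y = 1 + 2\xi_1 + 3\xi_1\xi_2$ with $\xi_i \sim U(-1, 1)$ in a Legendre basis, so that $\mathbb{E}[\Psi_\alpha^2] = \prod_i 1/(2\alpha_i + 1)$. The total-effect index is computed with the usual PCE rule of summing over all $\alpha$ with $\alpha_i > 0$, which extends the first-order formula above.

```python
def sobol_from_pce(coeffs, norms):
    """coeffs, norms: dicts mapping multi-index alpha -> c_alpha, E[Psi_alpha^2]."""
    d = len(next(iter(coeffs)))
    # Variance share of each non-constant term
    part = {a: c**2 * norms[a] for a, c in coeffs.items() if any(a)}
    var = sum(part.values())
    first = [sum(v for a, v in part.items()
                 if a[i] > 0 and all(a[j] == 0 for j in range(d) if j != i)) / var
             for i in range(d)]
    total = [sum(v for a, v in part.items() if a[i] > 0) / var
             for i in range(d)]
    return var, first, total

coeffs = {(0, 0): 1.0, (1, 0): 2.0, (1, 1): 3.0}
norms = {(0, 0): 1.0, (1, 0): 1.0 / 3.0, (1, 1): 1.0 / 9.0}

var, first, total = sobol_from_pce(coeffs, norms)
print(var)     # 7/3
print(first)   # [4/7, 0]
print(total)   # [1, 3/7]
```

 Here $\xi_2$ has no effect on its own ($S_2 = 0$) but matters through its interaction with $\xi_1$ ($S_2^T = 3/7$), which is exactly the distinction the two indices are meant to expose.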

Appendix B. Orthogonal Polynomials

B.1 Definition

 Given an interval $[a, b]$ and a non-negative weight $w(x)$, a sequence $\{\phi_0, \phi_1, \phi_2, \ldots\}$ of polynomials is called an orthogonal polynomial sequence with respect to $w(x)$ if

$$\langle \phi_i, \phi_j \rangle_w := \int_a^b \phi_i(x)\,\phi_j(x)\,w(x)\,dx = h_i\,\delta_{ij}$$

 where $h_i > 0$ are normalization constants and $\delta_{ij}$ is the Kronecker delta. If $h_i = 1$, the system is orthonormal.

B.2 Key Properties

(1) Three-term recurrence. Every orthogonal polynomial system satisfies a recurrence of the form

$$\phi_{n+1}(x) = (A_n x + B_n)\,\phi_n(x) - C_n\,\phi_{n-1}(x)$$

 which makes evaluation numerically stable and efficient.

(2) Distribution of zeros. $\phi_n(x)$ has exactly $n$ simple real zeros in the open interval $(a, b)$, which serve as the nodes of the $n$-point Gaussian quadrature rule.

(3) Best approximation. In the $L^2_w$ norm, the best polynomial approximation of $f$ of degree at most $n$ is

$$f_n^*(x) = \sum_{k=0}^n \frac{\langle f, \phi_k \rangle_w}{h_k}\,\phi_k(x)$$

 Each coefficient is determined independently of the others; $f_n^*$ is the orthogonal projection of $f$ onto the polynomials of degree at most $n$.
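
 The three properties can be exercised together for the Legendre family, where the recurrence reads $(n+1) P_{n+1} = (2n+1)\,x\,P_n - n\,P_{n-1}$ and $h_k = 2/(2k+1)$. The sketch below evaluates the polynomials by recurrence, uses Gauss-Legendre nodes for the inner products, and projects $f(x) = e^x$; the target function and the number of nodes are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import legendre as L

def legendre_by_recurrence(n_max, x):
    """Rows P_0 .. P_n_max evaluated at x via the three-term recurrence."""
    P = np.empty((n_max + 1, len(x)))
    P[0] = 1.0
    if n_max >= 1:
        P[1] = x
    for n in range(1, n_max):
        P[n + 1] = ((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1)
    return P

# Zeros of P_30 and matching weights: exact for polynomials up to degree 59
nodes, weights = L.leggauss(30)

def projection_coeffs(f, n):
    """c_k = <f, P_k> / h_k with h_k = 2 / (2k + 1)."""
    P = legendre_by_recurrence(n, nodes)
    h = 2.0 / (2 * np.arange(n + 1) + 1)
    return (P * weights * f(nodes)).sum(axis=1) / h

c3 = projection_coeffs(np.exp, 3)
c5 = projection_coeffs(np.exp, 5)

x = np.linspace(-1.0, 1.0, 201)
err5 = np.max(np.abs(np.exp(x) - c5 @ legendre_by_recurrence(5, x)))
print(c3)
print(c5[:4])     # same leading coefficients as c3
print(err5)       # small approximation error at degree 5
```

 Raising the degree from 3 to 5 leaves the first four coefficients untouched, which is the practical meaning of independent coefficient estimation.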

B.3 Representative Polynomials

| Name | Interval $[a,b]$ | Weight $w(x)$ | Initial Terms |
|---|---|---|---|
| Legendre $P_n$ | $[-1, 1]$ | $1$ | $P_0=1, P_1=x$ |
| Chebyshev (1st kind) $T_n$ | $[-1, 1]$ | $(1-x^2)^{-1/2}$ | $T_0=1, T_1=x$ |
| Hermite $H_n$ (probabilists’) | $(-\infty, \infty)$ | $e^{-x^2/2}$ | $H_0=1, H_1=x$ |
| Laguerre $L_n$ | $[0, \infty)$ | $e^{-x}$ | $L_0=1, L_1=1-x$ |
| Jacobi $P_n^{(\alpha,\beta)}$ | $[-1, 1]$ | $(1-x)^\alpha(1+x)^\beta$ | $P_0=1$ |
| Zernike $Z_n^m(r,\theta)$ | Unit disk | $1$ | 2D, separable in radius and angle |

B.4 Wiener-Askey Mapping (Distribution ↔ Orthogonal Polynomial)

| Distribution | Weight = PDF | Orthogonal Polynomial |
|---|---|---|
| Gaussian | $\propto e^{-x^2/2}$ | Hermite |
| Uniform on $[-1,1]$ | $1/2$ | Legendre |
| Gamma / Exponential | $\propto e^{-x}$ | Laguerre |
| Beta | $\propto (1-x)^\alpha(1+x)^\beta$ | Jacobi |
| Poisson | discrete | Charlier |
| Binomial | discrete | Krawtchouk |

 This mapping (Xiu & Karniadakis 2002) is the rule by which the orthogonal polynomial family of a PCE is chosen to match the input distribution.

B.5 Why Orthogonal Polynomials Are Powerful in PML

  • Numerical stability: The monomials $\{1, x, x^2, \ldots\}$ become nearly collinear on $[0,1]$ as the degree grows, producing an ill-conditioned Vandermonde regression matrix. Orthogonal bases avoid this.
  • Coefficient interpretability: Each coefficient directly represents the strength of its corresponding polynomial mode.
  • Modularity in adding terms: Adding higher-order terms does not alter existing coefficients, enabling adaptive modeling.
  • Variance decomposition: ANOVA-style decomposition follows automatically, so Sobol indices and other sensitivity measures can be read directly from the coefficients.
  • Compatibility with Gaussian quadrature: The zeros of the polynomials serve as quadrature nodes, making integrals and expectations efficient.
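
 The numerical-stability point in the list above is easy to observe. The sketch below compares the condition number of a degree-10 monomial design matrix on $[0, 1]$ with that of a Legendre design matrix on the same points mapped to $[-1, 1]$; the degree and the 200 equispaced points are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import legendre as L

x = np.linspace(0.0, 1.0, 200)
deg = 10

V_mono = np.vander(x, deg + 1, increasing=True)   # 1, x, ..., x^10
V_leg = L.legvander(2.0 * x - 1.0, deg)           # P_0 .. P_10 on [-1, 1]

cond_mono = np.linalg.cond(V_mono)
cond_leg = np.linalg.cond(V_leg)
print(f"monomial basis: {cond_mono:.2e}")
print(f"Legendre basis: {cond_leg:.2e}")
```

 The monomial matrix is worse conditioned by several orders of magnitude, which translates directly into amplified noise in least-squares coefficients.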

References

  • Blatman, G., & Sudret, B. (2011). Adaptive sparse polynomial chaos expansion based on least angle regression. Journal of Computational Physics, 230(6), 2345–2367.
  • Blondel, M., Fujino, A., Ueda, N., & Ishihata, M. (2016). Higher-order factorization machines. Advances in Neural Information Processing Systems, 29.
  • Brunton, S. L., Proctor, J. L., & Kutz, J. N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 113(15), 3932–3937.
  • Chrysos, G. G., Moschoglou, S., Bouritsas, G., Panagakis, Y., Deng, J., & Zafeiriou, S. (2020). P-Nets: Deep polynomial neural networks. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  • Cranmer, M. (2023). Interpretable machine learning for science with PySR and SymbolicRegression.jl. arXiv preprint arXiv:2305.01582.
  • Ghanem, R. G., & Spanos, P. D. (1991). Stochastic Finite Elements: A Spectral Approach. Springer.
  • Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-1(4), 364–378.
  • Kennedy, M. C., & O’Hagan, A. (2000). Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87(1), 1–13.
  • Konakli, K., & Sudret, B. (2016). Polynomial meta-models with canonical low-rank approximations: Numerical insights and comparison to sparse polynomial chaos expansions. Journal of Computational Physics, 321, 1144–1169.
  • Liu, Z., Wang, Y., Vaidya, S., Ruehle, F., Halverson, J., Soljačić, M., Hou, T. Y., & Tegmark, M. (2024). KAN: Kolmogorov-Arnold networks. arXiv preprint arXiv:2404.19756.
  • Loève, M. (1978). Probability Theory II (4th ed.). Springer.
  • Myers, R. H., Montgomery, D. C., & Anderson-Cook, C. M. (2016). Response Surface Methodology: Process and Product Optimization Using Designed Experiments (4th ed.). Wiley.
  • Oh, S. K., & Pedrycz, W. (2002). Self-organizing polynomial neural networks based on polynomial and fuzzy polynomial neurons: Analysis and design. Fuzzy Sets and Systems, 142(2), 163–198.
  • Oladyshkin, S., & Nowak, W. (2012). Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliability Engineering & System Safety, 106, 179–190.
  • Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707.
  • Rendle, S. (2010). Factorization machines. IEEE International Conference on Data Mining, 995–1000.
  • Schöbi, R., Sudret, B., & Wiart, J. (2015). Polynomial-chaos-based Kriging. International Journal for Uncertainty Quantification, 5(2), 171–193.
  • Sobol, I. M. (1993). Sensitivity estimates for nonlinear mathematical models. Mathematical Modelling and Computational Experiments, 1(4), 407–414.
  • Sudret, B. (2008). Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering & System Safety, 93(7), 964–979.
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267–288.
  • Xiu, D., & Karniadakis, G. E. (2002). The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM Journal on Scientific Computing, 24(2), 619–644.
  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320.