This report surveys Polynomial Machine Learning (PML) at an introductory level. PML refers to the family of techniques that exploit higher-order and interaction terms of input variables to learn nonlinear relationships. The discussion is organized along three taxonomic axes and arranged into six hierarchical levels. Deep mathematical or theoretical analysis is intentionally avoided; the goal is to provide a structured starting point for further study.

Taxonomic Axes and Hierarchical Levels

The taxonomy uses three classification axes as a coordinate system; each axis spans two of the six levels, as listed below.

Axis	Meaning	Levels
Axis A: Mathematical Foundation	On which mathematical property (orthogonality, domain) the polynomial is defined	Level 1, Level 2
Axis B: Model Architecture	Into which computational structure (neural network, tensor decomposition) the polynomial is embedded	Level 3, Level 4
Axis C: Application Pattern	How the polynomial is combined with or discovered from other models	Level 5, Level 6

Evolution Across Levels (Add / Subtract / Exchange)

The following table shows what is added (+), removed (−), or exchanged (↔) when moving from one level to the next.

Transition	(+) Added	(−) Removed	(↔) Exchanged
L1 → L2	Orthogonality constraint, link to probability distributions	Free use of arbitrary monomials	Power basis ↔ Orthogonal polynomial basis
L2 → L3	Learnable weights, hierarchical layers, optional activations	Closed-form coefficient estimation	Single regression equation ↔ Multilayer neural network
L3 → L4	Tensor decomposition, latent vector representation	Per-term explicit weight	Explicit term weight ↔ Latent vector inner product
L4 → L5	Coupling with non-polynomial models (GP, CNN, physics)	Single-model assumption	Polynomial-only model ↔ Polynomial + residual hybrid
L5 → L6	Discovery of the equation itself from data	Pre-fixed model form	Human-prescribed form ↔ Data-discovered form

Taxonomy Hierarchy

Polynomial Machine Learning (PML)
│
[Axis A: Mathematical Foundation]
├── [Level 1] Classical Polynomial Models
│   ├── 1.1 Polynomial Regression
│   ├── 1.2 Polynomial Kernel Methods
│   └── 1.3 Response Surface Methodology (RSM)
│
├── [Level 2] Orthogonal Polynomial Basis Models
│   ├── Group 2-A: Theory-based Orthogonal Polynomials
│   │   ├── 2.1 Spatial-domain (Zernike / Chebyshev / Legendre / Fourier-Bessel)
│   │   ├── 2.2 Probabilistic-domain — Wiener-Askey (Hermite / Laguerre / Jacobi / Gegenbauer)
│   │   └── 2.3 Discrete-domain (Charlier / Krawtchouk / Meixner / Hahn)
│   ├── Group 2-B: Learning Frameworks Using Orthogonal Polynomials
│   │   ├── 2.4 Polynomial Chaos Expansion (PCE)
│   │   ├── 2.5 Sparse PCE & LARS-PCE
│   │   └── 2.6 Arbitrary PCE (aPCE)
│   └── Group 2-C: Data-driven Orthogonal Bases
│       └── 2.7 Karhunen-Loève (KL) Expansion / Proper Orthogonal Decomposition (POD)
│
[Axis B: Model Architecture]
├── [Level 3] Polynomial Neural Architectures
│   ├── 3.1 Group Method of Data Handling (GMDH)
│   ├── 3.2 Modern Polynomial Neural Networks (PNN)
│   ├── 3.3 Pi-Nets
│   ├── 3.4 Self-Organizing Polynomial NN (SOPNN)
│   └── 3.5 Kolmogorov-Arnold Networks (KAN)
│
├── [Level 4] Tensor & Factorization-based Polynomial Models
│   ├── 4.1 Factorization Machines (FM)
│   ├── 4.2 Higher-Order FM (HOFM)
│   ├── 4.3 Tensor Train / Tensor Regression
│   └── 4.4 Polynomial Tensor Decomposition
│
[Axis C: Application Pattern]
├── [Level 5] Hybrid & Surrogate Modeling
│   ├── 5.1 PCE-Kriging
│   ├── 5.2 Global + Residual Models (Spline / GNN / CNN)
│   ├── 5.3 Physics-Informed Polynomial Models
│   └── 5.4 Multi-fidelity Polynomial Surrogates
│
└── [Level 6] Symbolic & Sparse Polynomial Discovery
    ├── 6.1 Sparse Identification of Nonlinear Dynamics (SINDy)
    ├── 6.2 Symbolic Regression with Polynomial Basis
    └── 6.3 LASSO / Elastic-Net Polynomial Feature Selection

Level 1. Classical Polynomial Models [Axis A]

The most fundamental layer, where linear models are extended directly using a power basis ($1, x, x^2, \ldots$). No orthogonality constraint applies.

1.1 Polynomial Regression

A monomial-basis regression of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_d x^d + \epsilon$. As the degree increases, multicollinearity becomes severe, so it is standard practice to combine it with regularization such as Ridge or Least Absolute Shrinkage and Selection Operator (LASSO).
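As a minimal sketch of the idea, assuming NumPy is available, the snippet below fits a monomial-basis regression with a ridge penalty in closed form. The helper name `fit_poly_ridge`, the synthetic data, and the penalty value are illustrative assumptions, not part of any cited method. The last line computes the condition number of a degree-10 design matrix to show how collinear the monomial columns become.

```python
import numpy as np

def fit_poly_ridge(x, y, degree, lam=0.0):
    """Fit y ~ sum_k beta_k x^k by ridge-regularized least squares."""
    X = np.vander(x, degree + 1, increasing=True)   # columns 1, x, ..., x^degree
    penalty = lam * np.eye(degree + 1)
    penalty[0, 0] = 0.0                             # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.05 * rng.standard_normal(x.size)

beta = fit_poly_ridge(x, y, degree=2, lam=1e-3)     # close to [1, 2, -3]
cond = np.linalg.cond(np.vander(x, 11, increasing=True))  # degree-10 design matrix
print(beta, cond)
```

Setting `lam` to zero recovers ordinary least squares; swapping the L2 penalty for an L1 penalty would give the LASSO variant mentioned above, which has no closed form and needs an iterative solver.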

1.2 Polynomial Kernel Methods

The kernel $K(x, y) = (x^T y + c)^d$ enables computation of inner products in a polynomial feature space without explicit high-dimensional mapping. It is used in Support Vector Machines (SVM), Kernel Ridge Regression, and Kernel Principal Component Analysis (Kernel PCA).
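The "no explicit mapping" claim can be checked numerically for a small case. The sketch below, assuming two-dimensional inputs and degree $d = 2$, compares the kernel value with the inner product of an explicit six-dimensional feature map. The function names and test vectors are illustrative assumptions.

```python
import numpy as np

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x^T y + c)^d."""
    return (x @ y + c) ** d

def phi_degree2(x, c=1.0):
    """Explicit feature map whose inner product equals the d=2 kernel in 2-D."""
    x1, x2 = x
    return np.array([
        x1**2, x2**2, np.sqrt(2.0) * x1 * x2,       # second-order terms
        np.sqrt(2.0 * c) * x1, np.sqrt(2.0 * c) * x2,  # first-order terms
        c,                                           # constant term
    ])

a = np.array([0.5, -1.0])
b = np.array([2.0, 0.25])
k_implicit = poly_kernel(a, b)
k_explicit = phi_degree2(a) @ phi_degree2(b)
print(k_implicit, k_explicit)   # the two values agree
```

For higher degree or dimension the explicit map grows combinatorially, while the kernel evaluation stays a single inner product and a power.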

1.3 Response Surface Methodology (RSM)

A second-order polynomial model $y = \beta_0 + \sum \beta_i x_i + \sum \beta_{ii} x_i^2 + \sum \beta_{ij} x_i x_j$ is used to find process optima. Combined with Central Composite Design (CCD) and Box-Behnken Design, it serves as a de facto standard for recipe optimization in etch, deposition, and Chemical Mechanical Planarization (CMP) processes (Myers et al. 2016).
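A minimal two-factor sketch of the RSM workflow follows, assuming NumPy and a noise-free synthetic response evaluated on a three-level factorial grid (which coincides with a face-centered CCD for two factors). Writing the fitted model as $y = \beta_0 + b^T x + x^T B x$, with $\beta_{ii}$ on the diagonal of $B$ and $\beta_{ij}/2$ off the diagonal, the stationary point is $x_s = -\tfrac{1}{2} B^{-1} b$. All names and the test response are illustrative assumptions.

```python
import itertools
import numpy as np

def quadratic_design(X):
    """Columns: 1, x1, x2, x1^2, x2^2, x1*x2 (second-order RSM model)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x2**2, x1 * x2])

# Three-level factorial design in coded units
X = np.array(list(itertools.product([-1.0, 0.0, 1.0], repeat=2)))
# Hypothetical response with its maximum at (0.5, -0.25)
y = 10.0 - (X[:, 0] - 0.5) ** 2 - 2.0 * (X[:, 1] + 0.25) ** 2

coef, *_ = np.linalg.lstsq(quadratic_design(X), y, rcond=None)
b0, b1, b2, b11, b22, b12 = coef

b = np.array([b1, b2])
B = np.array([[b11, b12 / 2.0], [b12 / 2.0, b22]])
x_stat = -0.5 * np.linalg.solve(B, b)      # stationary point of the fitted surface
is_maximum = bool(np.all(np.linalg.eigvalsh(B) < 0))
print(x_stat, is_maximum)
```

The eigenvalues of $B$ classify the stationary point: all negative indicates a maximum, all positive a minimum, and mixed signs a saddle.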

Level 2. Orthogonal Polynomial Basis Models [Axis A]

Level 2 introduces the constraint of orthogonality on top of Level 1 to gain stability and interpretability in coefficient estimation. Two polynomials $\phi_i, \phi_j$ are orthogonal under a weight function $w(x)$ if

$$\int_a^b \phi_i(x)\,\phi_j(x)\,w(x)\,dx = 0, \quad i \neq j$$

Benefits gained by exploiting orthogonality include: (i) coefficients can be estimated independently via orthogonal projection; (ii) adding higher-order terms does not perturb existing coefficients; (iii) output variance decomposes additively, enabling direct sensitivity analysis such as Sobol indices (see Appendix A); and (iv) the regression matrix is well-conditioned. A general treatment of orthogonal polynomials is given in Appendix B.

The seven sub-items are organized along two sub-axes: (A) the source of the basis functions (analytically defined vs. data-driven) and (B) the type of domain (spatial / probabilistic / discrete).

Group	Domain	Items
2-A: Theory-based	Spatial	2.1 Zernike / Chebyshev / Legendre / Fourier-Bessel
2-A: Theory-based	Probabilistic	2.2 Wiener-Askey (Hermite / Laguerre / Jacobi / Gegenbauer)
2-A: Theory-based	Discrete lattice	2.3 Charlier / Krawtchouk / Meixner / Hahn
2-B: Learning frameworks	Stochastic input → output regression	2.4 PCE / 2.5 Sparse PCE / 2.6 aPCE
2-C: Data-driven bases	Measurement data	2.7 KL Expansion / POD

2.1 Spatial-domain Orthogonal Polynomials

These polynomials are orthogonal on a specific geometric domain such as a wafer or die. They are used to decompose measured spatial data into global patterns.

Polynomial	Domain	Strength	Process Variation Use Case
Zernike	Circle	Aligns with optical aberration orthogonality	Wafer-level Warp/Bow, global thickness and overlay decomposition
Chebyshev	Square	Suppresses Runge phenomenon, minimax approximation	Scanner slit area, intra-die pattern variation
Legendre	Square / interval	Simple integration, center-weighted	Flatness/roughness variation, linear trend separation
Fourier-Bessel	Circle	Stable at the edge, captures high-frequency content	Wafer edge roll-off, post-CMP edge zone

2.2 Probabilistic-domain Orthogonal Polynomials (Wiener-Askey Mapping)

When the input is a random variable, the orthogonal polynomial is selected so that its weight function matches the Probability Density Function (PDF) of that distribution. Xiu & Karniadakis (2002) extended the original Hermite-Gaussian pairing of PCE to the entire Askey scheme, providing a one-to-one mapping between distributions and orthogonal families. This report refers to that mapping as the Wiener-Askey mapping.

Polynomial	Interval / Weight	Corresponding Distribution
Hermite	$(-\infty, \infty)$, $e^{-x^2/2}$	Gaussian
Laguerre	$[0, \infty)$, $e^{-x}$	Gamma / Exponential
Jacobi	$[-1, 1]$, $(1-x)^\alpha(1+x)^\beta$	Beta
Gegenbauer	$[-1, 1]$, $(1-x^2)^{\alpha-1/2}$	Special case of Beta
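The Hermite-Gaussian pairing in the first row can be verified numerically. The sketch below, assuming NumPy's `numpy.polynomial.hermite_e` module (probabilists' Hermite polynomials), computes the Gram matrix $\mathbb{E}[He_m(\xi)\,He_n(\xi)]$ under a standard Gaussian with Gauss-Hermite quadrature. The helper name `he` is an illustrative assumption.

```python
import numpy as np
from numpy.polynomial import hermite_e as He

# Quadrature rule for the weight exp(-x^2 / 2); rescale so weights sum to 1,
# which turns weighted sums into expectations under N(0, 1).
nodes, weights = He.hermegauss(20)
weights = weights / np.sqrt(2.0 * np.pi)

def he(n, x):
    """Evaluate the degree-n probabilists' Hermite polynomial."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return He.hermeval(x, c)

gram = np.array([[np.sum(weights * he(m, nodes) * he(n, nodes))
                  for n in range(5)] for m in range(5)])
print(np.round(gram, 10))   # diagonal with entries 0!, 1!, 2!, 3!, 4!
```

The off-diagonal entries vanish and the diagonal equals $n!$, which is the normalization constant $\mathbb{E}[He_n^2]$ for this family.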

2.3 Discrete-domain Orthogonal Polynomials

These are families orthogonal on integer lattices, suitable for discrete inputs such as defect counts.

Charlier: Poisson distribution
Krawtchouk: Binomial distribution
Meixner: Negative Binomial distribution
Hahn: Hypergeometric distribution

2.4 Polynomial Chaos Expansion (PCE)

PCE expands the response of a system with stochastic inputs as a series of orthogonal polynomials (Ghanem & Spanos 1991; Xiu & Karniadakis 2002).

$$Y = \sum_{\alpha \in \mathcal{A}} c_\alpha\, \Psi_\alpha(\xi)$$

Symbol	Meaning
$Y$	System output (scalar or vector), e.g., wafer thickness, critical dimension
$\xi = (\xi_1, \ldots, \xi_d)$	Standardized stochastic input vector, each $\xi_i$ following a known distribution
$d$	Number of stochastic input variables
$\alpha = (\alpha_1, \ldots, \alpha_d)$	Multi-index, $\alpha_i \in \mathbb{N}_0$, indicating the polynomial degree per input
$\mathcal{A}$	Set of multi-indices used (typically $\sum_i \alpha_i \leq p$)
$\Psi_\alpha(\xi)$	Product of univariate orthogonal polynomials: $\prod_{i=1}^d \psi_{\alpha_i}(\xi_i)$
$c_\alpha$	Polynomial coefficient (target of learning)
$p$	Truncation order of the expansion

Thanks to orthogonality, Sobol sensitivity indices (see Appendix A) can be obtained analytically from $c_\alpha$.
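A small regression-based PCE can be sketched as follows, assuming NumPy, two independent standard Gaussian inputs, a probabilists' Hermite basis, and total-degree truncation $p = 2$. The toy response and the helper names `multi_indices` and `pce_design` are illustrative assumptions; the coefficients are estimated by ordinary least squares rather than by projection.

```python
import itertools
import numpy as np
from numpy.polynomial import hermite_e as He

def multi_indices(d, p):
    """All alpha in N_0^d with alpha_1 + ... + alpha_d <= p."""
    return [a for a in itertools.product(range(p + 1), repeat=d) if sum(a) <= p]

def pce_design(xi, alphas):
    """Column for alpha is Psi_alpha(xi) = prod_i He_{alpha_i}(xi_i)."""
    cols = []
    for alpha in alphas:
        col = np.ones(xi.shape[0])
        for i, deg in enumerate(alpha):
            c = np.zeros(deg + 1)
            c[deg] = 1.0
            col = col * He.hermeval(xi[:, i], c)
        cols.append(col)
    return np.column_stack(cols)

rng = np.random.default_rng(1)
xi = rng.standard_normal((200, 2))
# Toy response: 1 + 2*He_1(xi_1) + 0.5*He_2(xi_2) + 0.3*He_1(xi_1)*He_1(xi_2)
y = 1.0 + 2.0 * xi[:, 0] + 0.5 * (xi[:, 1] ** 2 - 1.0) + 0.3 * xi[:, 0] * xi[:, 1]

alphas = multi_indices(2, 2)
coef, *_ = np.linalg.lstsq(pce_design(xi, alphas), y, rcond=None)
pce = dict(zip(alphas, coef))
print(pce)
```

Because the toy response lies exactly in the span of the basis, the fit recovers the coefficients up to rounding; the coefficient for the zero multi-index is the output mean.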

2.5 Sparse PCE & LARS-PCE

Why does the term count explode? For a $d$-dimensional input and total degree at most $p$, the number of PCE terms, written $P + 1$, is

$$P + 1 = \binom{d + p}{p} = \frac{(d+p)!}{d!\,p!}$$

This counts all integer combinations whose degree-sum is at most $p$, leading to combinatorial explosion. For example: $d=10, p=4 \Rightarrow 1{,}001$ terms; $d=20, p=4 \Rightarrow 10{,}626$; $d=50, p=4 \Rightarrow 316{,}251$. In semiconductor processes with tens to hundreds of inputs, the number of terms quickly exceeds the number of available samples, making naive PCE infeasible.
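The counts quoted above can be reproduced with the standard library. The function name `n_pce_terms` is an illustrative assumption.

```python
from math import comb

def n_pce_terms(d, p):
    """Number of multi-indices alpha in N_0^d with alpha_1 + ... + alpha_d <= p."""
    return comb(d + p, p)

for d in (10, 20, 50):
    print(d, n_pce_terms(d, 4))
# 10 1001
# 20 10626
# 50 316251
```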

The remedy is sparsity. Least Angle Regression (LARS) or Orthogonal Matching Pursuit (OMP) selects only the most important polynomial terms. Adaptive Sparse PCE (Blatman & Sudret 2011) is a representative method.
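The greedy selection idea can be illustrated with a compact Orthogonal Matching Pursuit routine. This is a sketch under simplifying assumptions (one uniform input, a Legendre candidate basis up to degree 20, a noise-free response with two active terms, and a known sparsity level), not the adaptive LARS-based algorithm of Blatman & Sudret. The function name `omp` is illustrative.

```python
import numpy as np
from numpy.polynomial import legendre as L

def omp(A, y, n_nonzero):
    """Orthogonal Matching Pursuit: greedily pick columns of A that explain y."""
    norms = np.linalg.norm(A, axis=0)
    support, residual = [], y.copy()
    for _ in range(n_nonzero):
        score = np.abs(A.T @ residual) / norms   # normalized correlation
        if support:
            score[support] = -np.inf             # never pick a column twice
        support.append(int(np.argmax(score)))
        # Refit all selected coefficients jointly, then update the residual
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ sol
    coef = np.zeros(A.shape[1])
    coef[support] = sol
    return coef

rng = np.random.default_rng(2)
xi = rng.uniform(-1.0, 1.0, 300)
A = L.legvander(xi, 20)                  # 21 candidate Legendre terms
y = 2.0 * A[:, 1] - 1.5 * A[:, 4]        # only P_1 and P_4 are active
coef = omp(A, y, n_nonzero=2)
selected = sorted(np.flatnonzero(coef).tolist())
print(selected, coef[selected])
```

In practice the sparsity level is not known in advance and is usually chosen by a cross-validation or leave-one-out error criterion.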

2.6 Arbitrary PCE (aPCE)

When the input distribution is unknown or non-standard, the orthogonal polynomial basis can be constructed directly from the empirical moments of the data (Oladyshkin & Nowak 2012). aPCE is useful for irregular process data where the Wiener-Askey mapping does not apply.
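One way to sketch the construction is to orthonormalize the monomials with respect to the empirical measure of the samples. The snippet below does this with a QR factorization of the sample Vandermonde matrix, which is a numerically convenient stand-in for the moment-matrix formulation in the cited paper, not a reproduction of it. The bimodal sample and the function name are illustrative assumptions.

```python
import numpy as np

def empirical_orthonormal_basis(samples, degree):
    """Return C whose column k holds monomial coefficients (lowest power first)
    of a degree-k polynomial orthonormal under the empirical measure."""
    V = np.vander(samples, degree + 1, increasing=True)
    _, R = np.linalg.qr(V / np.sqrt(len(samples)))
    return np.linalg.inv(R)      # upper triangular, so column k has degree k

rng = np.random.default_rng(3)
# A bimodal sample that matches none of the standard distribution families
samples = np.concatenate([rng.normal(-1.0, 0.3, 500), rng.normal(1.5, 0.6, 500)])

C = empirical_orthonormal_basis(samples, degree=3)
Phi = np.vander(samples, 4, increasing=True) @ C   # basis evaluated at the samples
gram = Phi.T @ Phi / len(samples)                  # empirical E[phi_i phi_j]
print(np.round(gram, 8))                           # identity matrix
```

The resulting polynomials can then replace the Hermite or Legendre factors in a PCE design matrix for that input.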

2.7 Data-driven Orthogonal Bases (KL Expansion / POD)

Whereas 2.1–2.6 use polynomials defined a priori, 2.7 builds the basis directly from the data. The covariance structure of measurements is eigen-decomposed, and the eigenvectors corresponding to the largest eigenvalues serve as an orthogonal basis. Conceptually, the data itself reveals its own dominant variation modes (mode 1, mode 2, mode 3, …).

Karhunen-Loève (KL) Expansion: Optimal orthogonal decomposition of a random field; suitable for extracting principal modes of wafer-to-wafer (W2W) variation (Loève 1978).
Proper Orthogonal Decomposition (POD): Discrete and practical version of KL; mathematically equivalent to Principal Component Analysis (PCA).
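A minimal POD sketch via the singular value decomposition follows, assuming NumPy and a synthetic data set of 30 wafers measured at 50 sites along a line. The two underlying patterns (a tilt and a bow), their amplitudes, and the noise level are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
sites = np.linspace(-1.0, 1.0, 50)
tilt = sites                                  # hypothetical linear pattern
bow = sites**2 - np.mean(sites**2)            # hypothetical quadratic pattern
n_wafers = 30

# Each row is one wafer: random amplitudes of the two patterns plus noise
data = (rng.normal(0.0, 3.0, (n_wafers, 1)) * tilt
        + rng.normal(0.0, 1.0, (n_wafers, 1)) * bow
        + 0.01 * rng.standard_normal((n_wafers, sites.size)))

centered = data - data.mean(axis=0)           # remove the mean wafer
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

energy = s**2 / np.sum(s**2)                  # variance fraction per mode
modes = Vt[:2]                                # orthonormal spatial modes
scores = U[:, :2] * s[:2]                     # per-wafer mode amplitudes
print(np.round(energy[:4], 6))
```

The rows of `Vt` are the eigenvectors of the sample covariance matrix, and `s**2` is proportional to its eigenvalues, which is the PCA equivalence noted above.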

Level 3. Polynomial Neural Architectures [Axis B]

Level 3 embeds polynomial combinations (higher-order and interaction terms) inside neuron computations.

3.1 Group Method of Data Handling (GMDH)

The progenitor of Polynomial Neural Networks (PNN) (Ivakhnenko 1971). At each layer, second-order polynomial candidates over variable pairs $(x_i, x_j)$ are generated, and only nodes that pass an external validation criterion advance to the next layer, allowing the network to grow autonomously.

$$y = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2$$
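A single GMDH layer can be sketched as follows, assuming NumPy, a simple train/validation split as the external criterion, and validation mean squared error as the selection score. The function names, the synthetic data, and the number of surviving nodes are illustrative assumptions; a full GMDH would feed the survivors' outputs into the next layer and repeat until the validation error stops improving.

```python
import itertools
import numpy as np

def pair_features(xi, xj):
    """Design matrix for y = a0 + a1 xi + a2 xj + a3 xi xj + a4 xi^2 + a5 xj^2."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def gmdh_layer(X_train, y_train, X_val, y_val, keep=2):
    """Fit one quadratic node per variable pair; keep the best on validation data."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        a, *_ = np.linalg.lstsq(pair_features(X_train[:, i], X_train[:, j]),
                                y_train, rcond=None)
        pred = pair_features(X_val[:, i], X_val[:, j]) @ a
        mse = float(np.mean((pred - y_val) ** 2))
        candidates.append((mse, (i, j), a))
    candidates.sort(key=lambda c: c[0])       # smallest validation error first
    return candidates[:keep]

rng = np.random.default_rng(5)
X = rng.uniform(-1.0, 1.0, (120, 4))
y = 1.0 + X[:, 0] * X[:, 2] + 0.5 * X[:, 2] ** 2   # depends on x0 and x2 only

survivors = gmdh_layer(X[:80], y[:80], X[80:], y[80:], keep=2)
best_mse, best_pair, best_coef = survivors[0]
print(best_pair, np.round(best_coef, 6))
```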

3.2 Modern Polynomial Neural Networks (PNN)

A modern extension of GMDH in which the degree, variable selection, and number of layers are determined adaptively.

3.3 Pi-Nets

Pi-Nets (Chrysos et al. 2020) express the output as a higher-order polynomial expansion of the input and use tensor decompositions (CANDECOMP/PARAFAC, Tucker) to keep parameter counts tractable. They achieve strong expressive power even without activation functions.

3.4 Self-Organizing Polynomial Neural Networks (SOPNN)

An extension of GMDH (Oh & Pedrycz 2002) that allows partial polynomials of varying degree at each node.

3.5 Kolmogorov-Arnold Networks (KAN)

KAN (Liu et al. 2024) places learnable univariate functions (B-splines or polynomials) on the edges of the network. Choosing polynomial bases for the edge functions effectively yields a structured generalization of PNN.

Level 4. Tensor & Factorization-based Polynomial Models [Axis B]

This level handles higher-order interaction terms efficiently through tensor decomposition.

4.1 Factorization Machines (FM)

Factorization Machines (Rendle 2010) represent the interaction weights of a second-order polynomial regression as inner products of low-dimensional latent vectors rather than learning each weight independently.

$$\hat{y} = w_0 + \sum_i w_i x_i + \sum_{i \lt j} \langle \mathbf{v}_i, \mathbf{v}_j \rangle\, x_i x_j$$

FM is particularly effective on sparse data such as recommendation systems and click-through-rate prediction.
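The FM prediction can be sketched in a few lines, assuming NumPy and randomly drawn parameters for illustration. Each feature $i$ carries a latent vector (row `V[i]`), and the pairwise weight is the inner product of two such vectors. The second function uses the linear-time reformulation given by Rendle (2010), $\sum_{i \lt j} \langle v_i, v_j \rangle x_i x_j = \tfrac{1}{2} \sum_f \big[ (\sum_i v_{if} x_i)^2 - \sum_i v_{if}^2 x_i^2 \big]$. The function names are illustrative.

```python
import numpy as np

def fm_predict_naive(x, w0, w, V):
    """Direct double loop over all feature pairs, O(k n^2)."""
    y = w0 + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            y += (V[i] @ V[j]) * x[i] * x[j]
    return y

def fm_predict_fast(x, w0, w, V):
    """Equivalent O(k n) form of the pairwise interaction term."""
    s = V.T @ x                    # per-factor weighted sums
    s2 = (V**2).T @ (x**2)         # per-factor sums of squares
    return w0 + w @ x + 0.5 * np.sum(s**2 - s2)

rng = np.random.default_rng(6)
n_features, k = 8, 3
x = rng.standard_normal(n_features)
w0, w = 0.1, rng.standard_normal(n_features)
V = rng.standard_normal((n_features, k))   # one k-dimensional latent vector per feature

y_naive = fm_predict_naive(x, w0, w, V)
y_fast = fm_predict_fast(x, w0, w, V)
print(y_naive, y_fast)   # the two values agree
```

The model has $1 + n + nk$ parameters instead of the $1 + n + n(n-1)/2$ of an unconstrained second-order regression, which is what makes it usable on sparse, high-dimensional inputs.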

4.2 Higher-Order Factorization Machines (HOFM)

HOFM (Blondel et al. 2016) extends FM to third- and fourth-order interactions efficiently using ANOVA kernels.

4.3 Tensor Train / Tensor Regression

The coefficients of a multivariate polynomial are viewed as a tensor and compressed via Tensor Train (TT) decomposition or Tucker decomposition. Low-rank approximations of high-dimensional PCE belong to this category (Konakli & Sudret 2016).

4.4 Polynomial Tensor Decomposition

A formulation that casts tensor decomposition itself in polynomial form.

Level 5. Hybrid & Surrogate Modeling [Axis C]

Level 5 combines different polynomial techniques, or polynomial models with non-polynomial models, to maximize expressive power.

5.1 PCE-Kriging

PC-Kriging (Schöbi et al. 2015) captures global trends with PCE and models the residual as a Gaussian Process (GP). It is a standard paradigm in virtual metrology.
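The trend-plus-residual structure can be sketched in one dimension as below. This is a simplified two-stage illustration with assumed fixed hyperparameters (kernel length scale, nugget, trend degree), whereas PC-Kriging as published treats the polynomial trend inside a universal Kriging model and calibrates the GP hyperparameters. All names and the toy response are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as L

def rbf(a, b, length):
    """Squared-exponential covariance between two sets of 1-D points."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def fit_trend_plus_gp(x, y, degree=2, length=0.15, nugget=1e-10):
    # Stage 1: global trend by least squares on a Legendre basis
    coef, *_ = np.linalg.lstsq(L.legvander(x, degree), y, rcond=None)
    resid = y - L.legvander(x, degree) @ coef
    # Stage 2: zero-mean GP interpolation of the residual
    K = rbf(x, x, length) + nugget * np.eye(x.size)
    weights = np.linalg.solve(K, resid)

    def predict(x_new):
        trend = L.legvander(x_new, degree) @ coef
        return trend + rbf(x_new, x, length) @ weights

    return predict

x = np.linspace(-1.0, 1.0, 12)
y = x**2 + 0.2 * np.sin(8.0 * x)     # smooth trend plus a local wiggle
predict = fit_trend_plus_gp(x, y)
print(np.max(np.abs(predict(x) - y)))   # near zero: the GP interpolates the residual
```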

5.2 Global + Residual Models

Low-order orthogonal polynomials (e.g., Zernike) capture global shape, while a Spline (notably the Thin Plate Spline), Graph Neural Network (GNN), or Convolutional Neural Network (CNN) learns the fine-scale residual. This hybrid is effective for local distortions such as those caused by chuck adsorption.

5.3 Physics-Informed Polynomial Models

A polynomial variant of Physics-Informed Neural Networks (PINN) (Raissi et al. 2019). Governing equations are included in the loss function, and the solution is expanded in a polynomial basis (typically Chebyshev or Legendre); this construction is referred to as a Spectral PINN.

5.4 Multi-fidelity Polynomial Surrogates

Multi-fidelity surrogates combine low-fidelity (fast simulation) and high-fidelity (measurement) data (Kennedy & O’Hagan 2000). They are essential for virtual metrology that fuses Technology Computer-Aided Design (TCAD) simulations with measurements.

Level 6. Symbolic & Sparse Polynomial Discovery [Axis C]

The most recent direction: discovering interpretable polynomial expressions directly from data.

6.1 Sparse Identification of Nonlinear Dynamics (SINDy)

SINDy (Brunton et al. 2016) builds a library matrix from candidate functions (polynomials, trigonometric terms, etc.) and applies sparse regression to it (sequentially thresholded least squares in the original paper, with LASSO as a common alternative), thereby identifying the governing equations of a dynamical system.
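A minimal sketch of the library-plus-sparse-regression idea follows, assuming NumPy and a one-dimensional logistic system $\dot{x} = x - x^2$. It uses sequentially thresholded least squares, the sparse solver described by Brunton et al. (2016), rather than LASSO. For simplicity the derivative is evaluated analytically; with real data it would be estimated from the time series. The function name `stlsq` and the threshold are illustrative assumptions.

```python
import numpy as np

def stlsq(Theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares on a candidate library Theta."""
    xi = np.linalg.lstsq(Theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                      # prune weak terms
        if small.all():
            break
        # Refit only the surviving terms
        xi[~small] = np.linalg.lstsq(Theta[:, ~small], dxdt, rcond=None)[0]
    return xi

x = np.linspace(0.1, 2.0, 100)
dxdt = x - x**2                              # exact derivative of the toy system
Theta = np.vander(x, 4, increasing=True)     # library columns: 1, x, x^2, x^3

xi = stlsq(Theta, dxdt)
print(xi)   # nonzero only for the x and x^2 columns
```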

6.2 Symbolic Regression with Polynomial Basis

Tools based on genetic programming, such as PySR (Cranmer 2023), use polynomial terms as building blocks and search over candidate expressions.

6.3 LASSO / Elastic-Net Polynomial Feature Selection

A classical approach: a large number of polynomial features are generated and then pruned via regularization (Tibshirani 1996; Zou & Hastie 2005).

Mapping to Wafer Process Variation Modeling (Within-Wafer, WiW / Wafer-to-Wafer, W2W)

Application	Taxonomy Position	Method
WiW spatial decomposition (global shape)	Level 2.1	Zernike / Chebyshev / Legendre / Fourier-Bessel regression
WiW spatial decomposition (local residual)	Level 5.2	Global + Spline/GNN/CNN residual
W2W variation mode extraction	Level 2.7	KL Expansion / POD
Process variation Uncertainty Quantification (UQ)	Level 2.4, 2.5	PCE / Sparse PCE
Virtual Metrology	Level 5.1, 5.4	PCE-Kriging, Multi-fidelity surrogate
Process recipe optimization (DOE)	Level 1.3	RSM
Process dynamics discovery	Level 6.1	SINDy

Appendix A. Sobol Sensitivity Indices

Sobol indices (Sobol 1993) are a global-sensitivity measure that quantifies how much of the output variance is attributable to each input variable, or to combinations of inputs.

A.1 ANOVA Decomposition

Suppose $Y = f(X_1, X_2, \ldots, X_d)$ admits the Analysis of Variance (ANOVA) decomposition

$$f(X) = f_0 + \sum_i f_i(X_i) + \sum_{i \lt j} f_{ij}(X_i, X_j) + \cdots + f_{1,2,\ldots,d}(X_1, \ldots, X_d)$$

If the inputs are independent, the components are mutually orthogonal and the output variance decomposes as $\mathrm{Var}(Y) = \sum_i V_i + \sum_{i \lt j} V_{ij} + \cdots$, where $V_i = \mathrm{Var}(f_i(X_i))$ is the contribution of $X_i$ alone and $V_{ij}$ captures the interaction between $X_i$ and $X_j$.

A.2 First-order and Total-effect Indices

$$S_i = \frac{V_i}{\mathrm{Var}(Y)}, \qquad S_i^T = 1 - \frac{\mathrm{Var}(\mathbb{E}[Y \mid X_{\sim i}])}{\mathrm{Var}(Y)}$$

$S_i$ is the contribution of $X_i$ acting alone, while the total-effect index $S_i^T$ is the total contribution of $X_i$ including all its interactions.

A.3 Closed-form Computation in PCE

In a PCE $Y = \sum_\alpha c_\alpha \Psi_\alpha(\xi)$, orthogonality makes the variance a simple sum:

$$\mathrm{Var}(Y) = \sum_{\alpha \neq 0} c_\alpha^2\, \mathbb{E}[\Psi_\alpha^2]$$

Letting $\mathcal{A}_i = \{\alpha : \alpha_i > 0,\ \alpha_j = 0\ \forall j \neq i\}$, we have

$$S_i = \frac{\sum_{\alpha \in \mathcal{A}_i} c_\alpha^2\, \mathbb{E}[\Psi_\alpha^2]}{\sum_{\alpha \neq 0} c_\alpha^2\, \mathbb{E}[\Psi_\alpha^2]}$$

All Sobol indices follow in closed form from the PCE coefficients alone, with no additional simulation required. This property (Sudret 2008) is the central reason PCE has become a standard in UQ.
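The closed-form computation can be sketched with the standard library alone. The example assumes a probabilists' Hermite basis, for which $\mathbb{E}[\Psi_\alpha^2] = \prod_i \alpha_i!$, and a hypothetical set of coefficients for two inputs. The total-effect index is obtained by summing over every multi-index in which input $i$ appears. The function name is illustrative.

```python
import math

def sobol_from_pce(coefs):
    """coefs maps multi-index tuples to PCE coefficients (Hermite basis assumed)."""
    d = len(next(iter(coefs)))
    # Variance contribution of each non-constant term: c_alpha^2 * E[Psi_alpha^2]
    part = {a: c * c * math.prod(math.factorial(k) for k in a)
            for a, c in coefs.items() if any(a)}
    var = sum(part.values())
    # First order: only input i is active (all other degrees are zero)
    first = [sum(v for a, v in part.items() if a[i] > 0 and sum(a) == a[i]) / var
             for i in range(d)]
    # Total effect: input i is active, alone or in interactions
    total = [sum(v for a, v in part.items() if a[i] > 0) / var
             for i in range(d)]
    return var, first, total

# Hypothetical coefficients: mean 1, main effects of both inputs, one interaction
coefs = {(0, 0): 1.0, (1, 0): 2.0, (0, 2): 0.5, (1, 1): 0.3}
var, first, total = sobol_from_pce(coefs)
print(var, first, total)
```

For these coefficients the variance is $4 + 0.5 + 0.09 = 4.59$, and the gap between each total-effect and first-order index is the share of the single interaction term.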

Appendix B. Orthogonal Polynomials

B.1 Definition

Given an interval $[a, b]$ and a non-negative weight $w(x)$, a sequence $\{\phi_0, \phi_1, \phi_2, \ldots\}$ of polynomials is called an orthogonal polynomial sequence with respect to $w(x)$ if

$$\langle \phi_i, \phi_j \rangle_w := \int_a^b \phi_i(x)\,\phi_j(x)\,w(x)\,dx = h_i\,\delta_{ij}$$

where $h_i > 0$ are normalization constants and $\delta_{ij}$ is the Kronecker delta. If $h_i = 1$, the system is orthonormal.

B.2 Key Properties

(1) Three-term recurrence. Every orthogonal polynomial system satisfies a recurrence of the form

$$\phi_{n+1}(x) = (A_n x + B_n)\,\phi_n(x) - C_n\,\phi_{n-1}(x)$$

which makes evaluation numerically stable and efficient.

(2) Distribution of zeros. $\phi_n(x)$ has exactly $n$ simple real zeros in the open interval $(a, b)$, which serve as the nodes of $n$-point Gaussian quadrature.

(3) Best approximation. In the $L^2_w$ norm, the best polynomial approximation of $f$ of degree at most $n$ is

$$f_n^*(x) = \sum_{k=0}^n \frac{\langle f, \phi_k \rangle_w}{h_k}\,\phi_k(x)$$

each coefficient determined independently of the others — the orthogonal projection.
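The three properties can be checked for the Legendre family, assuming NumPy. For Legendre polynomials the recurrence constants are $A_n = (2n+1)/(n+1)$, $B_n = 0$, $C_n = n/(n+1)$, and the normalization is $h_k = 2/(2k+1)$. The helper names are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as L

def legendre_recurrence(n, x):
    """Evaluate P_n(x) via (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = np.ones_like(x), x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

# (1) The recurrence agrees with NumPy's own evaluation of P_5
x = np.linspace(-1.0, 1.0, 7)
p5 = legendre_recurrence(5, x)
ref = L.legval(x, [0, 0, 0, 0, 0, 1])

# (2) The zeros of P_3 are the nodes of the 3-point Gauss-Legendre rule,
#     which integrates polynomials up to degree 5 exactly
nodes, weights = L.leggauss(3)
roots = np.sort(L.legroots([0, 0, 0, 1]))
integral_x4 = np.sum(weights * nodes**4)       # exact value is 2/5

# (3) Projection coefficients c_k = <f, P_k> / h_k, computed by quadrature
def project(f, n, n_quad=20):
    t, w = L.leggauss(n_quad)
    return np.array([(2 * k + 1) / 2.0 * np.sum(w * f(t) * legendre_recurrence(k, t))
                     for k in range(n + 1)])

c2 = project(lambda t: t**2, 2)    # x^2 = (1/3) P_0 + (2/3) P_2
c4 = project(lambda t: t**2, 4)    # raising the degree leaves c_0..c_2 unchanged
print(c2, c4)
```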

B.3 Representative Polynomials

Name	Interval $[a,b]$	Weight $w(x)$	Initial Terms / Notes
Legendre $P_n$	$[-1, 1]$	$1$	$P_0=1, P_1=x$
Chebyshev (1st kind) $T_n$	$[-1, 1]$	$(1-x^2)^{-1/2}$	$T_0=1, T_1=x$
Hermite $H_n$ (probabilists’)	$(-\infty, \infty)$	$e^{-x^2/2}$	$H_0=1, H_1=x$
Laguerre $L_n$	$[0, \infty)$	$e^{-x}$	$L_0=1, L_1=1-x$
Jacobi $P_n^{(\alpha,\beta)}$	$[-1, 1]$	$(1-x)^\alpha(1+x)^\beta$	$P_0=1$
Zernike $Z_n^m(r,\theta)$	Unit disk	$1$	2D, separable in radius and angle

B.4 Wiener-Askey Mapping (Distribution ↔ Orthogonal Polynomial)

Distribution	Weight = PDF	Orthogonal Polynomial
Gaussian	$\propto e^{-x^2/2}$	Hermite
Uniform on $[-1,1]$	$1/2$	Legendre
Gamma / Exponential	$\propto e^{-x}$	Laguerre
Beta	$\propto (1-x)^\alpha(1+x)^\beta$	Jacobi
Poisson	discrete	Charlier
Binomial	discrete	Krawtchouk

This mapping (Xiu & Karniadakis 2002) is the basis on which PCE automatically chooses the orthogonal polynomial family that matches the input distribution.

B.5 Why Orthogonal Polynomials Are Powerful in PML

Numerical stability: On $[0,1]$ the columns of the monomial basis $\{1, x, x^2, \ldots\}$ become nearly collinear as the degree grows, producing an ill-conditioned Vandermonde regression matrix. Orthogonal bases avoid this.
Coefficient interpretability: Each coefficient directly represents the strength of its corresponding polynomial mode.
Modularity in adding terms: Adding higher-order terms does not alter existing coefficients, enabling adaptive modeling.
Variance decomposition: ANOVA-style decomposition follows automatically — Sobol indices and sensitivity analysis become direct.
Compatibility with Gaussian quadrature: The zeros of the polynomials serve as quadrature nodes, making integrals and expectations efficient.
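The numerical-stability point can be made concrete by comparing condition numbers of two degree-10 design matrices on the same 200 equispaced points in $[0,1]$: one built from monomials and one from Legendre polynomials mapped to that interval. This is a sketch assuming NumPy; the degree and sample count are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import legendre as L

x = np.linspace(0.0, 1.0, 200)
degree = 10

# Monomial (Vandermonde) design matrix: columns 1, x, ..., x^10
cond_monomial = np.linalg.cond(np.vander(x, degree + 1, increasing=True))

# Legendre design matrix after mapping [0, 1] to [-1, 1]
cond_legendre = np.linalg.cond(L.legvander(2.0 * x - 1.0, degree))

print(f"monomial: {cond_monomial:.3e}, Legendre: {cond_legendre:.3e}")
```

The monomial matrix is worse conditioned by several orders of magnitude, which translates directly into amplified rounding and noise sensitivity in the estimated coefficients.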

References

Blatman, G., & Sudret, B. (2011). Adaptive sparse polynomial chaos expansion based on least angle regression. Journal of Computational Physics, 230(6), 2345–2367.
Blondel, M., Fujino, A., Ueda, N., & Ishihata, M. (2016). Higher-order factorization machines. Advances in Neural Information Processing Systems, 29.
Brunton, S. L., Proctor, J. L., & Kutz, J. N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 113(15), 3932–3937.
Chrysos, G. G., Moschoglou, S., Bouritsas, G., Panagakis, Y., Deng, J., & Zafeiriou, S. (2020). Π-nets: Deep polynomial neural networks. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Cranmer, M. (2023). Interpretable machine learning for science with PySR and SymbolicRegression.jl. arXiv preprint arXiv:2305.01582.
Ghanem, R. G., & Spanos, P. D. (1991). Stochastic Finite Elements: A Spectral Approach. Springer.
Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-1(4), 364–378.
Kennedy, M. C., & O’Hagan, A. (2000). Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87(1), 1–13.
Konakli, K., & Sudret, B. (2016). Polynomial meta-models with canonical low-rank approximations: Numerical insights and comparison to sparse polynomial chaos expansions. Journal of Computational Physics, 321, 1144–1169.
Liu, Z., Wang, Y., Vaidya, S., Ruehle, F., Halverson, J., Soljačić, M., Hou, T. Y., & Tegmark, M. (2024). KAN: Kolmogorov-Arnold networks. arXiv preprint arXiv:2404.19756.
Loève, M. (1978). Probability Theory II (4th ed.). Springer.
Myers, R. H., Montgomery, D. C., & Anderson-Cook, C. M. (2016). Response Surface Methodology: Process and Product Optimization Using Designed Experiments (4th ed.). Wiley.
Oh, S. K., & Pedrycz, W. (2002). Self-organizing polynomial neural networks based on polynomial and fuzzy polynomial neurons: Analysis and design. Fuzzy Sets and Systems, 142(2), 163–198.
Oladyshkin, S., & Nowak, W. (2012). Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliability Engineering & System Safety, 106, 179–190.
Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707.
Rendle, S. (2010). Factorization machines. IEEE International Conference on Data Mining, 995–1000.
Schöbi, R., Sudret, B., & Wiart, J. (2015). Polynomial-chaos-based Kriging. International Journal for Uncertainty Quantification, 5(2), 171–193.
Sobol, I. M. (1993). Sensitivity estimates for nonlinear mathematical models. Mathematical Modelling and Computational Experiments, 1(4), 407–414.
Sudret, B. (2008). Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering & System Safety, 93(7), 964–979.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267–288.
Xiu, D., & Karniadakis, G. E. (2002). The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM Journal on Scientific Computing, 24(2), 619–644.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320.
